Episomal expression cassettes for gene therapy

ABSTRACT

The invention consists of episomal expression cassettes for expression of a transgene in gene therapy. The expression cassettes consist of regulatory elements of the human cytokeratin gene and a transgene. The invention also includes liposomes for transfection of epithelial tissue with the cassettes in treatment of cystic fibrosis, emphysema, and cancers of epithelial origin arising in the lung or other organs.

FIELD OF THE INVENTION

The invention relates to episomal expression cassettes for gene therapy that express a transgene in epithelial cells.

BACKGROUND OF THE INVENTION

1.1 Gene Delivery

Demonstration of the feasibility of gene transfer to humans by a number of clinical trials stimulated considerable interest in gene therapy in the scientific community even though no therapeutic benefit has yet been offered to patients (7). Epithelial tissue, particularly lung epithelial tissue, has considerable potential as a target for gene therapy. The lung is a highly suitable organ for in vivo gene therapy treatment of patients with potentially lethal lung disorders, such as cystic fibrosis, cancers of epithelial origin and emphysema, because of its large accessible epithelial and endothelial surface area (15). Both virus-based and non-virus-based methods can be used to deliver genes to lungs (6, 15). The use of liposomes as gene transfer agents seems to have some significant advantages for in vivo lung gene therapy (6, 15). First, liposomes offer a wide margin of safety with low toxicity and have already been used to deliver drugs to humans. They can be administered into the lungs as an aerosol, by direct lavage or following intravenous injection. A clinical trial in nasal epithelia showed no adverse effects; nasal biopsies showed no immuno-histological changes (4). Secondly, liposome-complexed DNA can be used to transfect both resting and dividing cells. In addition, large DNA constructs can be accommodated with liposomes for transfection. Finally and most importantly, liposome-mediated gene expression is episomal, thereby avoiding or reducing the risk of random chromosomal insertions. However, one of the major impediments to liposome-mediated in vivo gene therapy is that the currently available expression vectors only offer a very low level of transient transgene expression (15). Therefore, enhancement of the therapeutic gene expression would not only increase the efficacy, but also effectively decrease the already low levels of toxicity by reducing the dose of therapeutic reagent.

1.2 Control of Gene Expression

The inefficient expression of transgenes in lung is, at least in part, due to the lack of proper lung-specific gene expression cassettes (15). An ideal expression cassette for human lung gene therapy should be safe and confer an appropriate level of tissue-specific expression for a reasonable duration. The rational design of expression cassettes for lung gene therapy relies on our knowledge of regulation of gene expression. Regulation of eukaryotic gene expression is a very complicated process.

A particular gene may be expressed in only one type of cell or tissue, while others are expressed in most cell types or tissues. For example, cytokeratin genes are expressed predominantly in epithelial cells (26). In contrast, genes encoding proteins involved in translation (protein synthesis) are expressed in every cell type. The activity of a eukaryotic gene can be regulated at any stage during the course of its expression, such as transcription, RNA splicing, RNA stability, translation, or post-translational modification. Current knowledge indicates that transcription and RNA splicing are the major steps for regulation of many eukaryotic genes.

1.2a Transcriptional regulation

Transcription of eukaryotic genes is catalyzed by an RNA polymerase which is recruited to the promoter by multiple protein factors involved in transcription initiation. Regulation of transcription can be attributed to tissue-specific DNA elements (enhancers or silencers) that stimulate or repress transcription through interaction with tissue-specific transcription factors (25). However, these elements may not function if they reside in an inappropriate location on a chromosome, suggesting that chromosomal position and structure also affect gene expression. This led to the discovery of a type of regulatory element called the locus control region (LCR) (13). These LCRs, when integrated into chromosomes, confer copy number-dependent and location-independent gene expression. The first LCR was discovered in the 5′ region of the human β-globin gene cluster (9, 10, 13). LCRs are now known to be associated with other genes (28, 36), including human cytokeratin 18 and rat LAP (C/EBPb), which direct gene expression in lung cells of transgenic mice (28, 36). Although currently there is no evidence to show that LCRs enhance episomal gene expression, this possibility cannot be ruled out since information about the interactions of LCRs with other regulatory elements is still limited. If LCRs increase gene expression, they would be useful in the design of episomal expression cassettes. As lung epithelial cells are not actively dividing, the delivered plasmid DNA may be wrapped by histones or other nuclear factors and kept in a transcriptionally inactive conformation. Although it is generally believed that plasmids, when transferred into the nucleus, do not form chromatin structures, recent experiments by Jeong and Stein demonstrate that some of the transfected DNAs are in chromatin form (17). The presence of a functional LCR in expression cassettes may allow a plasmid to stay in an open conformation.

1.2b Regulation through RNA processing

Regulation of RNA splicing is also very important for tissue-specific and developmentally regulated gene expression (35). This type of regulated RNA splicing, or alternative RNA splicing, can lead to the production of different proteins from a single gene by inclusion of different exons in different mRNAs. Some introns contain strong enhancers and their exclusion from expression constructs would lead to diminished gene expression. For example, the first intron of the human cytokeratin 18 gene contains a strong enhancer which is required for expression of the cytokeratin 18 gene (29). Other introns that do not contain enhancers may also affect gene expression. For example, the presence of rpL32 intron 3 leads to a 30-fold increase in mRNA level relative to the intronless rpL32 minigene (21). However, different introns clearly have different effects. For instance, inclusion of intact thymidylate synthase gene intron 4 alone at its normal position in the thymidylate synthase (TS) coding region leads to a decrease in the level of expression relative to that observed with the intronless TS minigene (21). The details of this splicing regulation of expression are unknown.

1.3 Gene expression in lung epithelial cells

Efficient tissue-specific gene expression can be achieved, in theory, by using tissue-specific promoters, promoter elements, RNA processing signals, and tissue-specific RNA-stabilizing elements. Cell-specific gene expression primarily results from tissue-specific promoters and/or tissue-specific regulatory elements, such as enhancers, silencers, and locus control regions (LCRs). However, it is very difficult to design a cassette for lung gene therapy because there is not enough information known about regulation of lung gene expression. Currently, no suitable expression vector for lung gene therapy has been reported. There is a pressing need for an effective expression vector because a number of human CF gene therapy trials have been conducted (7). The SV40 promoter was used to direct CFTR expression in the clinical trial by Caplen et al. (4); we observed that the SV40 promoter is not very active even in cultured lung epithelial cells (see FIG. 5) and its expression in rat lung primary cells (the primary cells are first generation cells isolated from the rat lung, i.e. they are not immortalized cell lines cultured for many generations) is undetectable (Plumb and Hu, unpublished results). That might explain the large amounts of plasmid DNA (10 mg to 300 mg per nostril) used in the study (4). Recently, several cis-acting elements and trans-acting factors regulating lung epithelial gene expression have been identified. The promoters of the SP-A (surfactant protein A), SP-B (surfactant protein B), SP-C (surfactant protein C), SP-D (surfactant protein D) and CC10 (Clara cell 10 kD protein) genes have been extensively analyzed (22, 31, 40, 41). Because these genes are predominantly expressed in type II or Clara cells (22), their promoters, unless modified, would not likely be suitable for expressing genes in epithelial cells of conducting airways, which represent the primary target for CF lung gene therapy.

1.4 Epithelial Expression Cassette for Lung Gene Therapy

Because of the low efficiency of liposome-mediated gene expression, strong viral promoters are often used in gene therapy studies. However, this may not be the ideal approach for liposome-mediated lung gene therapy. For example, the CMV major immediate early gene promoter has been shown to be very strong for transient expression of transgenes in cultured cells, but two studies have shown it to be a poor promoter for lung gene expression in transgenic mice (1, 33). There is no evidence to show the CMV promoter can confer sustained episomal gene expression in vivo. Although it is unreasonable to expect permanent transgene expression from an episomal plasmid, long lasting expression even at a low level may offer considerable clinical benefits to gene therapy patients. In addition, viral promoters may not confer tissue-specificity. Since the nuclear uptake of delivered DNA is currently highly inefficient (44), in addition to the low efficiency of liposome-mediated gene expression, the effect of non-specifically expressing a therapeutic gene in vivo is not yet a practical concern. However, when the nuclear uptake and liposome-delivery technology are improved, this has to be seriously considered because there must be an advantage for nature to select genes, such as the cystic fibrosis transmembrane conductance regulatory gene (CFTR), to be epithelium-specific.

If human DNA regulatory elements could direct tissue-specific expression of therapeutic genes at a comparable level to that from strong viral promoters in lung epithelial cells, and sustain gene expression longer than the viral promoters, it would be advantageous to use them for lung gene therapy. At present, there is no suitable expression vector for epithelial tissue gene therapy. There is a need to develop gene therapy cassettes that use human DNA regulatory elements which naturally express genes in epithelial cells and can be used to direct the expression of therapeutic genes. It would be particularly useful if there was an expression cassette that could direct a high level of reporter gene expression in vivo and in vitro. The expression cassette should be safe and confer an appropriate level of tissue-specific expression for a reasonable duration. The expression cassette should be capable of use in epithelial cells, such as submucosal cells.

BRIEF DESCRIPTION OF THE DRAWINGS

Preferred embodiments of the invention will be described in relation to the drawings in which:

FIGS. 1A-1B Improvement of GUS Reporter Gene System. FIG. 1A shows modification of the GUS reporter gene. FIG. 1B shows chemiluminescent assays of GUS gene expression. RFLE, rat fetal lung primary epithelial cells. The Bioorbit Luminometer (model number 1253) was used and 1 reading unit from this model equals 10,000 reading units from other models such as Berthold Lumat LB 9501.

FIGS. 2A-2C Expression of GLP reporter in A549, IB3 and rat fetal lung primary epithelial cells. FIG. 2A shows expression of the green fluorescent protein in cultured human lung cells. A549 cells were transfected with pGREENLANTERN-1 (GIBCO BRL) and visualized under a fluorescent microscope (bottom panel) 2 days post-transfection. The phase-contrast view of the same cells is shown in the top panel. FIG. 2B shows expression of the green fluorescent protein in human cystic fibrosis bronchial epithelial cells. IB3 cells were transfected with pGREENLANTERN-1 (GIBCO BRL) and visualized under a fluorescent microscope (bottom panel) 2 days post-transfection. The phase-contrast view of the same cells is shown in the top panel. FIG. 2C shows expression of the green fluorescent protein in rat lung primary cells. Rat fetal lung epithelial cells were transfected with pGREENLANTERN-1 (GIBCO BRL) and visualized under a fluorescent microscope (bottom panel) 2 days post-transfection. The phase-contrast view of the same cells is shown in the top panel.

FIG. 3 Optimization of cell transfection conditions for gene expression. Cells were transfected with pCEP4SEAP complexed with DODAC:DOPE at 2.5 nmol/cm². Each sample corresponds to 50 μl of culture medium conditioned by the transfected cells.

FIGS. 4A-4C Schematic diagrams of SEAP and CFTR expression constructs. FIG. 4A shows the genomic structure of the human cytokeratin 18 gene (K18), in which exons 1 through 7 are depicted as solid boxes and DNase I hypersensitive sites as arrows. The Intron-1 fragment covers from the end of exon 1 to the beginning of exon 2. The minimal promoter fragment spans 310 base pairs between a unique Xho I (X) site and the K18 translation initiation, excluding the start codon. Enhancer-long and Enhancer fragments cover regions from Hind III (H) to Nsi I sites and from Nsi I to Xho I sites, respectively. FIG. 4B shows the simplified structures of the promoterless SEAP construct (Clontech) and a series of its derivatives which contain various segments of K18 untranslated sequence, as well as their relative expression levels. FIG. 4C shows the structure of K18EpilongTECFTR, which is identical to K18EpilongSEAP except that the reporter gene SEAP is replaced by CFTR cDNA with a translational enhancer (adapted from Alfalfa Mosaic Virus RNA4) immediately upstream of the CFTR coding sequence.

FIG. 5 Expression pattern of K18 constructs in comparison with SV40- or CMV-promoter directed expression in selected cell lines. A549, WI38, or COS-7 cells were transfected with DNA-lipid complex in parallel. Culture media were collected and assayed for SEAP activity 48 hr post-transfection.

FIG. 6 Long lasting gene expression in cells transfected with K18EpilongSEAP. Shown are expression kinetics of K18EpilongSEAP versus CMVSEAP. A549 and COS-7 cells were transfected with DNA:lipid mix at 1:10 ratio. Culture media were collected at days post-transfection as indicated, prior to media change, and stored at −80° C. SEAP reporter assay was performed according to standard procedure.

FIG. 7 Expression of K18EpilongSEAP in rat fetal lung primary cells. Both epithelial cells and fibroblast cells were transfected with plasmid DNA and DODAC:DOPE at 1:10 ratio. The plasmid pINXCAT was used as a negative control. pCEP4SEAP contains the CMV promoter.

FIG. 8 Iodide efflux of COS-7 cells transfected with CFTR expression cassettes. Functional analysis for CFTR by iodide efflux assay. COS-7 cells were transfected with K18EpilongTECFTR, pCMVnot6.2CFTR as a positive control, or a negative control plasmid. Forty-eight hours post-transfection, cells were loaded with iodide for one hour followed by extensive washes. cAMP-dependent channel activity was then assessed as iodide concentration in the wells before and following the addition of the agonist, forskolin, at 0 time point.

FIGS. 9A-9C Splicing of the K18-CFTR chimeric RNA transcript. FIG. 9A shows a schematic diagram of the K18-CFTR chimeric RNA transcript and positions of the primers used in RT-PCR. FIG. 9B shows RT-PCR products from total RNAs isolated from the CFTR transfected IB3 and rat fetal primary epithelial cells. The types of cells and primer sets are indicated on the top. Lane 4 shows the 1 kb ladder. The 712- and 640-bp bands are the expected PCR products from these two primer sets. The stars indicate the mis-spliced products. RNAs from untransfected cells do not yield any bands (data not shown). FIG. 9C shows the K18 intron 1 sequences critical for splicing [SEQ ID NO:20].

FIGS. 10A-10C Identification of the cryptic 3′ splice sites in the CFTR coding region and improvement of the splicing efficiency of the K18-CFTR chimeric RNA transcript by mutagenesis. FIG. 10A shows a schematic diagram of the structures of K18EpilongTECFTR and the RNA transcript. Primers used for RT-PCR in FIG. 11 are depicted as arrows. FIG. 10B shows DNA sequences of K18EpilongTECFTR at the K18 3′ splice site and two cryptic splice sites in the CFTR coding region [SEQ ID NOS:21 to 23]. FIG. 10C shows DNA sequences of K18mCFTR at the respective sites (nucleotide nos. 3293-3325, 3398-3408 and 3611-3626 of [SEQ ID NO:1]). Mutations introduced are indicated by asterisks.

FIG. 11 Splicing patterns of K18-CFTR chimeric RNA transcripts. Shown are PCR products from reverse-transcribed (RT+) total RNAs isolated from A549 cells transfected with the indicated plasmids. The correctly-spliced transcript yields a 696 bp band, which is the only species in K18mCFTR transfected cells. In K18EpilongTECFTR transfected cells, two faster-migrating species, corresponding to splicing products utilizing the cryptic splice sites in the CFTR coding region, are present along with the 696 bp band.

FIG. 12 Functional analysis of the CFTR channel activity by iodide efflux assay. COS-7 cells were transfected with K18EpilongTECFTR, K18EpilongmCFTR, pCDM8.1CFTR [SEQ ID NO:1] as a positive control, or a negative control plasmid (K18Epilong). Forty-eight hours post-transfection, cells were loaded with iodide for one hour followed by extensive washes. cAMP-dependent channel activity was then assessed as iodide concentration in the wells before and following the addition of the agonist, forskolin, at 0 time point.

FIG. 13 Targeting expression of the LacZ reporter gene in mouse lung epithelia. The lung was dissected out from a 14 day transgenic mouse fetus and stained with X-gal for 3 hr. K18mLacZ is clearly expressed in the airways of the lung.

FIG. 14 A lung of a normal mouse fetus. The lung was excised from a 14 day mouse fetus and stained with X-gal overnight.

FIG. 15 Enhancer activity of the 1.4 kb DNA fragment from the 5′ region of the human K18 gene. A549 cells and COS-1 cells were transfected with K18EpiSEAP or K18EpilongSEAP, which contains the distal enhancer. SEAP activities in the culture media are normalized to total protein.

FIG. 16 The position effect of K18 intron 1 on reporter gene expression. In the K18EpilongSEAPi construct, the intron was moved downstream of the SEAP coding region. Relocating this intron abolished the expression of the SEAP reporter gene by the construct.

FIG. 17 Temporal expression of the lacZ reporter gene in lung airways of the transgenic fetuses. A negative control lung stained with X-gal under the same conditions is shown on the left, at each time point.

FIG. 18 Submucosal expression of the lacZ reporter gene. The left panel shows a horizontal tissue section from the lower part of the trachea of a control mouse. The middle panel shows a horizontal tissue section from the lower part of the trachea of a transgenic mouse. The right panel shows submucosal expression of the reporter gene in a tissue section of the upper part of the trachea from the same transgenic mouse.

FIG. 19 K18EpilongmCFTR [SEQ ID NO:1] restriction map.

FIGS. 20A-20C (a) DNA sequence of K18EpilongmTELacZ [SEQ ID NO:19]; (b) restriction map; (c) features.

FIG. 21 Enhancement of human CC10 expression by intron 1 of the human cytokeratin 18 gene in A549, a cell line of human lung carcinoma origin.

SUMMARY OF THE INVENTION

The invention satisfies the need for a suitable expression vector for epithelial tissue gene therapy. The expression cassettes of this invention contain human DNA regulatory elements which naturally express genes in epithelial cells and direct the expression of therapeutic genes. For example, the regulatory elements may be from the human cytokeratin 18 gene. The expression cassettes also direct a high level of reporter gene expression in vivo and in vitro. The expression cassettes are safe and confer an appropriate level of tissue-specific expression for a reasonable duration. The expression cassettes may be used in epithelial cells, such as submucosal cells.

The invention also satisfies the need for expression cassettes using epithelial cell specific regulatory elements from mammals. In a preferred embodiment, the human cytokeratin 18 gene regulatory elements are used. Certain elements of the cytokeratin 18 gene from other mammals may also be beneficially used with the expression cassettes.

The invention also relates to a host cell (isolated cell in vitro or a cell in vivo) containing a DNA sequence including: an expression cassette of the invention and the DNA sequence of a gene to be expressed. In a preferred embodiment, the DNA sequence is operatively linked to the expression cassette and capable of expression in the cell and the DNA sequence encodes a protein selected from the group consisting of: 1) a CFTR protein; 2) a protein having sequence similarity to CFTR; and 3) a protein having CFTR activity.

The invention is an expression cassette for the episomal expression of a transgene in targeted epithelial cells, which consists of regulatory elements of the human cytokeratin gene and a transgene. In one embodiment of the invention, the expression cassette is targeted to a lung epithelial cell. The regulatory elements may comprise a promoter, the 5′ region and modified intron 1 of the human cytokeratin 18 gene.

In the cassette, the human cytokeratin gene is the human cytokeratin 18 gene. The regulatory elements of the cassette are from the 5′ region of the human cytokeratin 18 gene. The regulatory elements may also consist of a promoter, the 5′ region and intron 1 of the human cytokeratin 18 gene. The cassette may also contain an enhancer.

The transgene in the cassette can be the cystic fibrosis transmembrane conductance regulatory gene. In another embodiment, the transgene in the cassette can consist of an enhancer and a modified cystic fibrosis transmembrane conductance regulatory (CFTR) gene.

The cells targeted by the cassette may be epithelial cells, such as submucosal cells.

A liposome may be used to deliver the expression cassette construct.

Cells may be transfected by the expression cassette construct. In one embodiment, the cells are part of tissue in a lung.

The invention also includes a method of treating a patient having a lung disorder, by administering to the patient a liposome containing the cassette so that the cassette transfects a targeted lung cell. The method of administration of the liposome may be selected from a group consisting of aerosol administration, intratracheal instillation and intravenous injection. The expression cassette can be used in treatment of a disorder such as cystic fibrosis, emphysema, and cancers of epithelial origin arising in the lung or other organs.

Another aspect of the invention relates to an expression cassette for the episomal expression of a transgene in a targeted epithelial cell, consisting of: regulatory elements of a human gene, and a transgene operatively associated with the regulatory elements and capable of expression in the epithelial cell.

The invention also relates to an expression cassette for the episomal expression of a transgene in a targeted epithelial cell, consisting of: regulatory elements of a cytokeratin gene, and a transgene operatively associated with the regulatory elements and capable of expression in the epithelial cell. In a preferred embodiment, the epithelial cell is a lung epithelial cell. The human gene is preferably a cytokeratin gene. The cytokeratin gene is preferably a mammalian cytokeratin gene. The cytokeratin gene is preferably the human cytokeratin 18 gene.

The regulatory elements are preferably from the 5′ region of the human cytokeratin 18 gene (all or part of the 5′ region including modifications thereto, provided the cassette is functional). In another embodiment, the regulatory elements comprise a promoter, the 5′ region (or modified 5′ region, provided the cassette is functional) and intron 1 (or modified intron 1, provided the cassette is functional) of the human cytokeratin 18 gene.

In another embodiment, the cassette consists of a promoter, the 5′ region (or modified 5′ region, provided the cassette is functional) and modified intron 1 of the human cytokeratin 18 gene. In another embodiment, the cassette may comprise an enhancer.

The transgene is preferably selected from the group consisting of a cystic fibrosis transmembrane conductance regulatory (CFTR) gene, a gene having at least 70% sequence identity with CFTR and encoding a protein having CFTR activity, and a gene encoding a protein having CFTR activity. In another embodiment, the transgene comprises an enhancer and a modified cystic fibrosis transmembrane conductance regulatory (CFTR) gene. The targeted epithelial cell is preferably a submucosal cell.

The invention also includes a liposome comprising the construct (or expression cassette). The invention also includes a transfected cell comprising the construct of claim 1, claim 2 or claim 6 and lung tissue comprising the cell of claim 15.

The invention also relates to an expression cassette for treating a defect in the CFTR gene in a target epithelial cell, the expression cassette comprising: the DNA of or corresponding to at least a portion of the DNA regulatory elements of a cytokeratin gene, which DNA is capable of regulating gene expression in the target epithelial cell; and a gene, operatively associated with the expression cassette elements and capable of expression in the epithelial cell, the gene encoding a protein selected from the group consisting of a CFTR protein; a protein having at least 70% sequence identity with the CFTR protein and having CFTR activity; and a protein having CFTR activity.

In alternate embodiments, the protein has at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, and most preferably at least 99% sequence identity with the CFTR protein and has CFTR activity.
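As an illustration only, the identity thresholds above could be checked with a short Python sketch such as the following. It is not part of the invention and rests on stated assumptions: the function names `percent_identity` and `meets_identity_threshold` are hypothetical, the two sequences are assumed to be already aligned to equal length with `-` as the gap character, and identity is taken as identical positions divided by aligned columns (other conventions for the denominator exist).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residue in both sequences

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration; they are not taken from any SEQ ID NO.
    ref = "ACGT-ACGTA"
    var = "ACGTTACGAA"
    print(percent_identity(ref, var))
    print(meets_identity_threshold(ref, var, 70.0))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the final percentage and threshold comparison.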

The expression cassette may comprise the DNA sequence in [SEQ ID NO:1] or the sequence shown in FIG. 19, or a modification or fragment of these sequences. In another embodiment, the expression cassette has at least 70% sequence identity to the sequence in [SEQ ID NO:1]. In one embodiment of the invention, the defect being treated with the cassette causes cystic fibrosis. The target cell is preferably a lung epithelial cell. The cytokeratin gene is preferably cytokeratin 18. The cytokeratin gene is preferably a human cytokeratin gene (or a cytokeratin gene from another mammal). The DNA regulatory elements from the cytokeratin 18 gene are preferably selected from the group consisting of: a promoter, the 5′ region and modified intron 1.

Another aspect of the invention relates to an epithelial cell containing recombinant human DNA regulatory elements and a gene operatively associated with the regulatory elements, the cell expressing proteins not normally expressed by the cell at biologically significant levels. The DNA regulatory elements preferably comprise cytokeratin DNA regulatory elements. The cytokeratin is preferably cytokeratin 18 (from a human or another mammal).

The DNA regulatory elements are preferably selected from the group consisting of: a promoter, the 5′ region and modified intron 1. The cell is preferably a human epithelial cell. In another embodiment, the cell is a human cystic fibrosis-associated cell.

The cell preferably contains a gene expressing a protein selected from the group consisting of: a CFTR protein; a protein having at least 70% sequence identity with the CFTR protein and having CFTR activity; and a protein having CFTR activity.

The invention also relates to an epithelial cell containing recombinant cytokeratin DNA regulatory elements and a gene operatively associated with the regulatory elements, the cell expressing a protein not normally expressed by the cell at biologically significant levels. The cytokeratin is preferably cytokeratin 18 from a human (or another mammal).

The DNA regulatory elements are preferably selected from the group consisting of: a promoter, the 5′ region and modified intron 1 (or fragments or modifications of these regions). The cell is preferably a human epithelial cell. In another embodiment, the cell is a human cystic fibrosis-associated cell. The cell has a gene preferably expressing a protein selected from the group consisting of: a CFTR protein; a protein having at least 70% sequence identity with the CFTR protein and having CFTR activity; and a protein having CFTR activity.

The invention also includes a method of treating a patient having a lung disorder, comprising administering to the patient a liposome containing the cassette of the invention whereby the cassette transfects targeted lung cells. The method of administration is preferably selected from a group consisting of aerosol administration, intratracheal instillation and intravenous injection. The disorder treated includes cystic fibrosis, cancers of epithelial origin and emphysema.

The invention also includes a method for treating a defect in a gene in a target epithelial cell, consisting of: administering to the epithelial cell an amount of an expression cassette of the invention so that the expression cassette is inserted in the epithelial cell and expressing the gene to produce the protein.

The invention also includes a method for treating defective chloride ion transport in a cystic fibrosis-associated epithelial cell in a subject having cystic fibrosis, consisting of: administering to the epithelial cell an amount of an expression cassette of the invention so that the expression cassette is inserted in the epithelial cell; expressing the gene to produce the protein so that the protein is transported to the plasma membrane and generates chloride channels in the cystic fibrosis-associated epithelial cell of the subject.

The invention also relates to a pharmaceutical composition comprising a therapeutically effective amount of the expression cassette and a pharmaceutically acceptable carrier. The invention also relates to a composition comprising the expression cassette and a carrier.

Another aspect relates to the use of the expression cassette for treatment of a disease, disorder or abnormal physical state selected from a group consisting of cystic fibrosis, cancers of epithelial origin and emphysema.

DETAILED DESCRIPTION OF THE INVENTION

This invention relates to expression cassettes for expressing therapeutic genes or other genes of interest in epithelial cells. The expression cassettes are preferably constructed from human DNA regulatory elements that naturally express genes in epithelial cells to direct the expression of transgenes for use in research, protein production and gene therapy in lung and other organs. The expression cassettes use human DNA regulatory elements that are specifically expressed in epithelial cells to provide high levels of protein expression.

The expression cassettes may be used in vivo or in vitro. Epithelial cells transformed in vitro can be used as a research tool or for protein production. The expression cassettes are also useful for gene therapy by transforming cells in vivo to express a therapeutic protein. Gene therapy may be used to treat diseases such as cystic fibrosis, cancers of epithelial origin or emphysema. For example, to upregulate the expression of a gene, one could insert the sense sequence into the expression cassette. To downregulate the expression of the gene, one could insert the antisense sequence into the expression cassette. Techniques for inserting sense and antisense sequences (or fragments of these sequences) would be apparent to those skilled in the art. The gene or gene fragment may be isolated from a native source (in sense or antisense orientation), synthesized, a mutated native or synthetic sequence, or a combination of these.
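The relationship between the sense and antisense orientations mentioned above can be sketched in a few lines of Python. This is a hedged illustration rather than a cloning protocol: the helper name `reverse_complement` is hypothetical, the input is assumed to contain only the unambiguous DNA letters A, C, G and T, and the example sequence is arbitrary and not drawn from any sequence listing.

```python
# Translation table mapping each DNA base to its complement (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Placing the reverse complement of a sense sequence in the same position
    of a cassette corresponds to inserting it in the antisense orientation.
    Assumption: only unambiguous bases (A, C, G, T) are present.
    """
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGCAGAGG"  # arbitrary example fragment
    antisense = reverse_complement(sense)
    print(sense, antisense)
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing inserts in either orientation.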

When the DNA regulatory elements used in the episomal expression cassettes are from the human genome, these elements offer better compatibility for human gene therapy because the authentic protein factors interacting with these DNA elements are present in targeted cells. These cassettes are epithelium-specific and highly efficient; the cell-specificity increases the efficacy and helps avoid adverse effects resulting from expression of the therapeutic gene in non-targeted cells. The high efficiency of gene expression is also critical to minimize the dosage of the therapeutic reagents used in gene therapy. Additionally, even in cultured cells, the expression from these constructs lasts longer than that from the viral promoter-based expression cassette (see FIG. 6).

The expression cassettes of the invention may be used to treat fatal diseases, such as cystic fibrosis, which are caused by genetic abnormalities in epithelial cells. The expression of the cystic fibrosis transmembrane conductance regulator (CFTR) gene in human lungs after birth is localized predominantly in the epithelial cells of trachea and large bronchi (37), especially in the submucosal cells (11, 12). In cystic fibrosis patients, death may result from lung failure caused by the genetic abnormality.

In a preferred embodiment, the human gene regulatory elements are from the cytokeratin 18 gene. In another embodiment of the invention, cytokeratin 18 gene regulatory elements from other mammals may be used alone or in combination with human cytokeratin 18 gene regulatory elements. Other epithelial cell specific DNA regulatory elements may also be combined in the expression cassettes of the invention.

The following steps are preferably used to design a CFTR expression cassette: first generating a series of DNA constructs that are assessed in cell lines for the expression of reporter genes or the human CFTR gene, then examining selected constructs in primary cells and whole tissue sections, and finally testing selected constructs in mice and humans. In a preferred embodiment, the expression cassette directs a high level of reporter gene expression in human epithelial cells in vivo and in vitro and in rat fetal lung primary epithelial cells. The cassette may be modified to efficiently direct expression of the human CFTR gene with a change in the CFTR coding sequence. The modified expression cassette directs efficient and cell-specific gene expression in lung epithelia of the transgenic mice and human epithelial cells in vivo and in vitro.

Episomal Expression Cassettes

The advantages of using human regulatory elements in the expression cassettes of the invention are described above. In a preferred embodiment of the invention, the human regulatory elements are from a human cytokeratin gene and most preferably from the human cytokeratin 18 gene. These elements control the expression of a transgene expressed in epithelial cells. The specific regulatory elements chosen for a particular cassette may vary depending on factors such as the level of activity of the cassette desired or the characteristics of the gene to be expressed. One skilled in the art can modify the sequences of the regulatory elements and the gene to be expressed using techniques disclosed in this application and known in the art.

Designing an Expression Cassette

K18EpilongSEAP (the construct K18Epilong plus the reporter system SEAP) directed a high level of the SEAP marker in rat lung primary cells (FIG. 7). Regulatory elements were derived from the human cytokeratin 18 gene and combined to form the K18Epilong sequence. The SEAP reporter gene system (see Example 1) was inserted into the cassette to measure levels of expression in epithelial cells. K18Epilong included the following cytokeratin 18 gene regulatory elements: 1) intron 1 (which contains a strong enhancer), 2) the K18 promoter, and 3) two 5′ fragments (which greatly enhance the level of gene expression). These elements were kept in their original configuration as much as possible in K18EpilongSEAP; however, other configurations may be used. Keeping the elements in their original configuration is preferable, where possible, to preserve interactions among transcription factors bound to these elements.

Many modifications may be made to the expression cassette DNA sequence and these will be apparent to one skilled in the art. The invention includes nucleotide modifications of the sequences disclosed in this application (or fragments thereof) that are capable of expressing genes in epithelial cells. For example, the K18Epilong sequence may be modified or a gene to be expressed may be modified using techniques known in the art. Modifications include substitution, insertion or deletion of nucleotides or altering the relative positions or order of nucleotides. The invention includes DNA which has a sequence with sufficient identity to a nucleotide sequence described in this application to hybridize under stringent hybridization conditions (hybridization techniques are well known in the art). The expression cassettes of the invention also include expression cassettes (or a fragment thereof) with nucleotide sequences having at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity or, most preferred, at least 99% identity to the expression cassette sequences of the invention, or a fragment thereof, including K18pSEAP, K18iSEAP, K18ESEAP, K18EpSEAP, K18EpiSEAP, K18EpilongSEAP, K18EplongSEAP, K18EplongSEAPi, K18Epilong, K18EpilongTECFTR, K18EpilongmCFTR (pK18mCFTR) [SEQ ID NO:1], pCC10SEAPII, pCC10K18ISEAPII, pCC10K18I and K18EpilongmTELacZ [SEQ ID NO:1] (FIG. 4c). The invention also includes nucleotide modifications of the aforementioned sequences that are capable of expressing genes in epithelial cells. Identity refers to the similarity of two nucleotide sequences that are aligned so that the highest order match is obtained. Identity is calculated according to methods known in the art. For example, if a nucleotide sequence (called “Sequence A”) has 90% identity to a portion of [SEQ ID NO: 1], then Sequence A will be identical to the referenced portion of [SEQ ID NO: 1] except that Sequence A may include up to 10 point mutations (such as deletions or substitutions with other nucleotides) per 100 nucleotides of the referenced portion of [SEQ ID NO: 1].
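The identity calculation in the Sequence A example can be sketched in Python. This is a minimal illustration, not the method used by any particular alignment program: it assumes the two sequences have already been aligned to equal length, treats a gap character ('-') as a mismatch, and uses hypothetical 20-nucleotide sequences rather than any portion of [SEQ ID NO: 1].

```python
def percent_identity(aligned_a: str, aligned_ref: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption for this sketch: a gap ('-') in either sequence counts as
    a mismatch, and columns where both sequences have a gap are skipped.
    """
    if len(aligned_a) != len(aligned_ref):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_ref.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 20-nt reference portion and a variant with two substitutions,
# i.e. the same rate as 10 point mutations per 100 nucleotides.
reference = "ACGTACGTACGTACGTACGT"
sequence_a = "ACGTACGAACGTACGTTCGT"
identity = percent_identity(sequence_a, reference)
```

With two substitutions in 20 aligned positions, the sketch reports 90% identity, matching the rate of 10 point mutations per 100 nucleotides described above.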

The invention also includes fragments of the sequences, for example fragments comprising two or more of the human regulatory elements of the invention which are operatively combined for expression in epithelial cells. The invention also includes DNA sequences which are complementary to the aforementioned sequences. One skilled in the art would also appreciate that as other regulatory elements in cytokeratin 18 or CFTR are identified, these may be used with the expression cassettes of the invention. Regulatory elements from other genes may also be used. As well, regulatory elements from the cytokeratin 18 gene in mammals other than humans could be inserted in the cassette provided that adequate gene expression still occurs. Other genes similar to the cytokeratin 18 gene may also be used in the expression cassettes. For example, the cytokeratin 18 gene and the cytokeratin 8 gene are expressed as a pair in humans, so certain regulatory elements from the cytokeratin 8 gene could be used in the cassettes in addition to, or in place of, cytokeratin 18 regulatory elements. Other cytokeratin genes are also useful sources of regulatory elements for the expression cassettes of the invention (human cytokeratin genes 1 to 19 are known; cytokeratin genes that are only expressed in skin are less likely to be useful for expression cassettes in lung epithelial tissue). Regulatory elements and sequences of other genes, such as cytokeratin genes, are known in the art. These regulatory elements may easily be inserted in expression cassettes of the invention and the levels of expression measured. For example, sequences from other cytokeratin genes or cytokeratin 18 genes from other mammals having a high level of sequence identity to the human regulatory elements used in the expression cassettes of the invention (such as cytokeratin 18 regulatory elements) may be easily identified by reviewing sequences from a database, such as GenBank. Suitable sequences preferably have at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity or most preferably have at least 99% identity to the sequence of a regulatory element (such as a cytokeratin 18 regulatory element) used in the cassettes of the invention disclosed in this application (or a fragment thereof).

Regulatory elements from other genes can be inserted in the expression cassettes. For example, the CC10 promoter (the CC10 sequence is available from the GenBank database; the sequence and other information relating to it in the GenBank database is incorporated by reference) could be substituted in a cassette in place of the K18 promoter. Combinations of regulatory regions can be used to vary the levels of protein production and/or to obtain more cell specific expression. The techniques described to produce an expression cassette with human regulatory elements from the cytokeratin 18 gene may also be used to produce expression cassettes from other genes.

Clara cells are nonciliated secretory epithelial cells in the lung airways and they are believed to be important in metabolism of xenobiotics and regeneration of the airway epithelium (Boyd, M. R. 1977, Nature, 269:713-715; Singh et al. 1990, Biochimica et Biophysica Acta 1039:348-355). Their major secretory product is called Clara cell 10 kD protein, or CC10, which is implicated in regulation of lung inflammation. We isolated a 3.3 kb DNA sequence corresponding to the promoter and the upstream region of the human CC10 gene by PCR-cloning. We inserted this DNA fragment into the pSEAPII vector (Tropix) to create the plasmid, pCC10SEAPII, for expressing the Secreted Alkaline Phosphatase (SEAP) reporter gene. To enhance gene expression driven by the CC10 promoter/enhancer in non-Clara cells of lung epithelia, we built the construct, pCC10K18ISEAPII, by inserting intron 1 of the human K18 gene upstream of the SEAP coding sequence. The levels of reporter gene activity of these constructs were assayed with A549 cells. As shown in FIG. 3, addition of the K18 intron greatly enhanced the activity of the CC10 promoter/enhancer in A549 cells, where the CC10 gene is normally not expressed very well. This broadening of cell-specificity to other lung epithelial cells could be quite useful. We also demonstrated here that the K18 intron can be functional when combined with the human CC10 gene promoter. The combination expression cassette may be further modified according to techniques apparent to one skilled in the art. The expression cassette may be used as a research tool or for protein production.

Other regulatory elements that control expression of the CC10 gene may also be used to produce an expression cassette. A clear advantage of using an expression cassette derived from CC10 and the gene it expresses is that cell specific expression may be obtained. Other genes may have human regulatory elements that are desirable to obtain less cell specific expression.

The DNA sequences of the invention (regulatory element sequences and therapeutic gene sequences) may be obtained from a cDNA library, for example using expressed sequence tag analysis. The nucleotide molecules can also be obtained from other sources known in the art such as genomic DNA libraries or synthesis.

CFTR Expression Cassettes

The examples below show how to make an expression cassette which may be used for CFTR (or its variants or other proteins with CFTR activity) in vivo and in vitro. The examples below are primarily directed to increasing the expression of CFTR. However, there are situations where lower levels of CFTR expression are desired, such as when using the cassette as a research tool. In such a case, the expression of the cassette may be altered by incorporating only some of the features of the expression cassettes described below. The level of expression activity of a modified construct can be measured in an animal model according to methods known in the prior art. Additional changes in the regulatory elements described below may be made in the cassette to alter the level of activity. Regardless of which of the regulatory elements below are used in a cassette, or which additional elements or changes are added to the cassette described below, any variants using human regulatory elements to express genes in epithelial cells are included within the scope of the invention. Any variants using regulatory elements from cytokeratin genes specifically expressed in epithelial cells of any mammal, particularly the cytokeratin 18 gene, are also included within the invention.

For example, the K18EpilongmCFTR [SEQ ID NO:1] expression cassette, discussed below, is customized to produce high levels of CFTR in epithelial cells. Human cytokeratin 18 gene elements were combined with the CFTR gene. Both the regulatory elements in the K18Epilong sequence and the gene were modified as described in the examples below to maximize CFTR production by K18EpilongmCFTR [SEQ ID NO:1].

K18Epilong was customized to produce a CFTR expression cassette called K18EpilongTECFTR by inserting CFTR cDNA (FIG. 4c). The CFTR gene sequence was manipulated to enhance CFTR protein synthesis. A translational enhancer was added to the 5′ end of the CFTR coding sequence and the translation initiation sequence was optimized according to the Kozak sequence.

The K18EpilongTECFTR cassette was further modified to increase CFTR protein synthesis. The modifications in the resulting expression cassette, called K18EpilongmCFTR (or pK18mCFTR) [SEQ ID NO: 1], improved RNA splicing efficiency and removed undesired RNA splice sites. In this cassette, the DNA sequence corresponding to the polypyrimidine tract of the K18 intron 1, which lies outside the CFTR coding region, was modified: five cytosine residues and three adenine residues were converted into thymine residues. These are transcribed into uracil in the pre-mRNA sequence (FIG. 10C). The 3′ splice site of the K18 intron was modified by changing the first nucleotide, adenine, of the following exon to guanine (FIG. 10C). The coding region of the CFTR gene was also altered to destroy the second cryptic 3′ splice site. This was done by making a single nucleotide change (adenine to guanine; FIG. 10C) which did not alter the protein sequence. A restriction map for K18EpilongmCFTR [SEQ ID NO:1] is shown in FIG. 19. The table below describes the K18EpilongmCFTR [SEQ ID NO:1] expression cassette.

Nucleotide Numbers (1-12143 bp)    Sequence Description
1 (Kpn I)-2565 (Mlu I)             K18 5′ enhancer and promoter
2566 (Mlu I)-3315                  K18 intron 1
3316-3354 (Nco I)                  Translational enhancer (TE)
3355 (Nco I)-7955 (Pst I)          CFTR (translation starts with ATG in Nco I)
7956-9283 (Sal I)                  SV40 small t Ag intron + SV40 early poly A signal
9284 (Sal I)-12143                 pSEAP (Tropix) backbone

The CFTR sequence can be excised with Nco I and Pst I (note that Pst I is not a unique site). Other genes or gene fragments can then be inserted in the expression cassette and expressed.
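The coordinate map of K18EpilongmCFTR can be held in a small data structure to check that the segments tile the full 12143 bp without gaps or overlaps. The following Python sketch uses the coordinates given in the table; the class and function names are hypothetical, and the description of the 7956-9283 segment follows our reading of the table entry.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    description: str


CASSETTE_LENGTH = 12143  # total length in bp, from the table heading

FEATURES: List[Feature] = [
    Feature(1, 2565, "K18 5' enhancer and promoter"),
    Feature(2566, 3315, "K18 intron 1"),
    Feature(3316, 3354, "Translational enhancer (TE)"),
    Feature(3355, 7955, "CFTR"),
    Feature(7956, 9283, "SV40 small t Ag intron + SV40 early poly A signal"),
    Feature(9284, 12143, "pSEAP backbone"),
]


def feature_length(feature: Feature) -> int:
    """Length in bp of a feature with inclusive coordinates."""
    return feature.end - feature.start + 1


def is_contiguous(features: List[Feature], total_length: int) -> bool:
    """True if the features cover 1..total_length with no gap or overlap."""
    expected_start = 1
    for feature in features:
        if feature.start != expected_start or feature.end < feature.start:
            return False
        expected_start = feature.end + 1
    return expected_start == total_length + 1


def feature_at(position: int, features: List[Feature]) -> Feature:
    """Return the feature containing a 1-based nucleotide position."""
    for feature in features:
        if feature.start <= position <= feature.end:
            return feature
    raise ValueError(f"position {position} is outside the cassette")
```

For instance, `feature_length(FEATURES[3])` gives the size of the Nco I to Pst I segment that would be exchanged when another gene is substituted for CFTR.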

We analyze the expression cassette in CF knockout mice. We generate transgenic mice to express the human CFTR gene with the expression cassettes. We introduce the CFTR expression cassette into CF knockout mice by crossing the CFTR-expressing mice with the CF knockout mice and rescue the CF mice by expression of the human CFTR gene with the expression cassette. In addition, we evaluate the expression cassettes for CFTR expression by intratracheal or intravenous delivery of the plasmid DNA complexed with liposomes (Logan et al. 1995, Gene Therapy 2:38-49; Liu et al. 1995, JBC 270:24864-24870).

Some of the changes described above to optimize CFTR expression may be omitted if a low level of CFTR expression is desired. For example, if the adenine to guanine change in the CFTR coding region is omitted, CFTR will be produced, but at a lower level. Likewise, variations in the number of cytosine or adenine to thymine mutations may be made if CFTR expression is not destroyed. It would be obvious to one skilled in the art that other changes could be made to alter the levels of expression of CFTR.

CFTR is one therapeutic protein which may be expressed in vivo or in vitro using the expression cassettes of the invention. Proteins resulting from changes in the nucleotide sequence which produce a chemically equivalent amino acid (for example, as a result of redundancy of the genetic code) or a chemically similar amino acid (for example where sequence similarity is present) may also be used as therapeutic proteins with the expression cassettes of the invention. For example, U.S. Pat. No. 5,240,846 discloses mutants of the CFTR gene having a silent mutation that stabilizes expression of the gene. U.S. Pat. No. 5,639,661 discloses genes encoding novel CF monomer proteins which have cystic fibrosis transmembrane conductance regulator (CFTR) protein activity.

Other Therapeutic Protein Expression Cassettes

Other therapeutic proteins or mutants may also be used with the cassette. The expression cassettes may be used to drive expression of cytokine genes, such as Interleukin 10 (de, V. J. 1995, Annals of Medicine 27:537-541), to control inflammation in lung, or to drive expression of DNA sequences encoding angiogenesis inhibitors, such as endostatin (O'Reilly et al. 1997, Cell 88:277-285) and angiostatin (O'Reilly et al. 1994, Cell 79:315-328), to inhibit tumor formation. These genes may be inserted in the cassette and expressed using techniques described in this application as well as other techniques known in the art. We analyze the expression cassette in a broad range of carcinoma cell lines and oncomice. We also evaluate the expression cassette in cancer gene therapy by testing it in a variety of cancer cell lines, including those from lung, breast, and colon carcinomas.

The expression cassettes are useful in other epithelial tissue, in addition to lung epithelial tissue, because the K18 gene is expressed in the epithelial cells of other internal organs (see Example 3). The DNA regulatory elements of the expression cassettes described below are also useful to direct tissue-specific expression of therapeutic genes in epithelial cells of other organs. However, successful expression of a reporter gene in the right cell type by an expression vector does not guarantee a positive outcome when a therapeutic gene is inserted in the same cassette if the DNA sequence of the therapeutic gene interferes with transcription or subsequent RNA splicing. One skilled in the art can modify the expression construct to accommodate a therapeutic gene. The level of expression activity of a modified construct can be measured in an animal model according to methods known in the prior art.

Research Tool

Mammals and cell cultures transformed with the expression cassette of the invention are useful as research tools. Mammals and cell cultures are used in research according to numerous techniques known in the art. For example, one obtains mice that do not express CFTR and uses them in experiments to assess CFTR gene expression. Experimental groups of mice are transformed with expression cassettes containing different types of CFTR genes (or genes similar to CFTR or fragments of genes) to assess the levels of protein produced, its functionality and the phenotype of the mice (for example, lung structure).

A cell line (either an immortalized cell culture or a primary cell culture) is transformed with an expression cassette of the invention containing a CFTR gene (or variants) to measure levels of expression of the gene and the activity of the gene.

Using Exogenous Agents in Combination With an Expression Cassette

Cystic fibrosis-associated cells transformed with the cassette expressing CFTR may be treated with compounds that mobilize the recombinant protein (CFTR or a protein having similar sequence and function) as well as mutant forms of CFTR that may already be produced by the cells, so that the native and/or recombinant protein is transported to the plasma membrane and generates chloride channels in the cells. U.S. Pat. No. 5,674,898 (Cheng et al.) discloses the use of agents such as carboxylic acid or carboxylate which treat defective chloride ion transport by mobilizing mutant CFTR protein.

Transplant of Cells Transformed With the Cassette

Cells transformed with an expression cassette of the invention may be used in epithelial tissue transplants according to techniques known in the art. Examples of the use of transformed epithelial tissue in transplants are in U.S. Pat. Nos. 4,980,286 and 5,399,346.

Transgenic Mice and Rat Primary Cells as Models of Expression Cassette Function in Humans

We used transgenic mice to evaluate the cell-specificity of the expression cassette because this is the most reliable approach to the analysis of mammalian gene expression at the whole organism level. We used rat primary epithelial and fibroblast cells with high purity for the evaluation because freshly isolated primary cells retain their original properties better than cultured cell lines. The transgenic mouse and rat primary cell models predict expression cassette function in humans.

EXAMPLE 1 Development of Reporter Genes for Liposome-mediated Plasmid Gene Transfer

For functional analysis of transcription regulatory elements, more than one reporter gene system is normally required because an extra reporter gene under a different promoter is needed to serve as an internal control to normalize the effects resulting from variation in transfection. In addition, a particular reporter gene may not be compatible with a particular expression cassette. Therefore, we developed or adapted the following convenient reporter gene systems for lung gene expression studies:

1.1) GUS Reporter System.

A bacterial gene (E. coli GUS, coding for β-glucuronidase) has worked well as a reporter gene in plants; its expression can be detected by either highly sensitive chemiluminescent assays (2, 3) or cell staining (16). Although β-glucuronidase activity is present in some mammalian cells, the optimal pH value for the mammalian enzyme is around 4-5 whereas that of the bacterial enzyme is around 7. We have subcloned the GUS gene into pCEP4 (Invitrogen) and transfected different cell lines and primary cells. We demonstrated that GUS can be a sensitive reporter for quantification of gene expression in lung cells. In order to further improve the sensitivity of the GUS gene as a reporter, we added a translational enhancer (18) and a DNA sequence encoding a nuclear localization signal (19) to the 5′ end of the GUS coding sequence. As shown in FIG. 1, GUS expression was greatly enhanced. We test and optimize the conditions for cell staining.

1.2) SEAP (secreted alkaline phosphatase) Reporter System.

We adapted SEAP as a primary reporter for gene expression in cultured cell lines and lung primary cells (FIGS. 3, 5 and 6) because the system is more economical and less labor-intensive than CAT or other reporter gene systems. The expression of SEAP can be quantified simply by chemiluminescent assay of the alkaline phosphatase secreted in culture media (2, 3).

1.3) GLP Reporter System.

We also adapted the Green Lantern Protein (GLP, a modified version of green fluorescent protein) as a reporter to mark the cells transfected with liposome/DNA complex (FIG. 2).

The above systems may be modified and other markers may be incorporated into the expression cassettes.
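The use of a second reporter as an internal control, described at the start of this example, amounts to dividing the signal of the test reporter by the signal of the co-transfected control reporter in the same well. A minimal Python sketch follows. The function name, the optional background subtraction (applied identically to both reporters for simplicity) and all readings are hypothetical illustrations, not measured data.

```python
def normalized_activity(test_signal: float,
                        control_signal: float,
                        background: float = 0.0) -> float:
    """Test-reporter signal divided by the internal-control signal.

    Assumption for this sketch: the same background value is subtracted
    from both readings before taking the ratio.
    """
    test_net = test_signal - background
    control_net = control_signal - background
    if control_net <= 0:
        raise ValueError("control signal must exceed background")
    # A test reading below background is reported as zero activity.
    return max(test_net, 0.0) / control_net


# Hypothetical (test, control) readings in arbitrary light units.
wells = {
    "construct_A": (1200.0, 400.0),
    "construct_B": (300.0, 200.0),
}
normalized = {
    name: normalized_activity(test, control)
    for name, (test, control) in wells.items()
}
```

Comparing the normalized values rather than the raw test readings reduces the influence of well-to-well differences in transfection efficiency.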

EXAMPLE 2 Optimization of Transfection Conditions

We carried out experiments to optimize the transfection conditions because liposome-mediated gene expression in cell lines of lung origin is very inefficient. We used cell lines such as A549 (human lung carcinoma cell line), IB3 (cystic fibrosis bronchial epithelial cell line transformed with adeno-12-SV40; (45)), COS7 (SV40-transformed African green monkey kidney), and WI38 (human lung diploid cells of fibroblast origin). There are many types of liposomes commercially available and a person skilled in the art is able to select suitable liposomes. We used DODAC:DOPE (INEX) because it is effective and large quantities will be available for clinical trials. For most of these cell lines, we found that about 2.5 nmol of DODAC:DOPE/cm² is optimal. FIG. 3 shows the effect of DNA:lipid ratio on gene expression in A549 and COS7 cells.
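Because the optimum is expressed per unit of growth area, the amount of lipid per culture vessel scales linearly with the vessel's surface. The Python sketch below shows the arithmetic using the value of about 2.5 nmol/cm² reported above; the function name and the vessel areas are hypothetical placeholders, and actual growth areas should be taken from the vessel manufacturer.

```python
OPTIMAL_LIPID_NMOL_PER_CM2 = 2.5  # approximate optimum reported in the text


def lipid_nmol_for_area(growth_area_cm2: float,
                        density_nmol_per_cm2: float = OPTIMAL_LIPID_NMOL_PER_CM2
                        ) -> float:
    """Amount of DODAC:DOPE (nmol) for a vessel of the given growth area."""
    if growth_area_cm2 <= 0:
        raise ValueError("growth area must be positive")
    return growth_area_cm2 * density_nmol_per_cm2


# Hypothetical growth areas in cm^2 (placeholders, not vendor values).
vessel_areas_cm2 = {"small_well": 2.0, "large_well": 10.0}
lipid_per_vessel = {
    name: lipid_nmol_for_area(area)
    for name, area in vessel_areas_cm2.items()
}
```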

EXAMPLE 3 Construction of K18 Expression Constructs

Cytokeratins are major components of the epithelial cytoskeleton and different sub-types characterize different epithelia (26). The cytokeratin 18 gene is expressed predominantly in internal organs (lung, liver, kidney and intestine) and brain. It is highly epithelium-specific and has been a useful marker of epithelial cell transitions in the remodeling adult lung (42, 43). The 2.5 kb sequence from the 5′ region is able to direct lung gene expression in a copy number-dependent and position-independent manner in transgenic mice (28). Therefore, this region can be considered as a lung LCR (locus control region). A 3.5-kb 3′ flanking sequence is required for gene expression in liver and intestine. There is a strong enhancer present in the first intron (29). To construct an expression cassette with the human cytokeratin 18 gene regulatory elements, we isolated the K18 minimal promoter, intron 1 and two 5′ fragments by PCR-cloning (FIG. 4). We found that no single element alone could direct SEAP expression in A549 or COS7 cells. The minimal promoter plus intron 1 has a low level of activity and the two fragments from the 5′ region can greatly enhance the level of gene expression (FIG. 4). Since the 5′ region and intron 1 of the K18 gene are critical for high levels of gene expression, we decided to keep these elements in their original configuration as much as possible in construction of our first expression cassette, K18EpilongSEAP, to preserve the potential interactions among the transcription factors bound to these elements. In this reporter expression construct, transcription will start from the K18 promoter, but protein translation will start from the first codon of the reporter gene because most of the K18 exon 1, including all the coding sequence, is deleted.

EXAMPLE 4 Episomal Expression of K18 Constructs in Cultured Cells

To show that the episomal expression directed by K18 regulatory elements has epithelial specificity, we expressed K18EpiSEAP in A549 (human lung epithelial origin) and WI38 (human lung fibroblast origin). As shown in FIG. 5, K18EpiSEAP expressed the reporter gene only in A549, but not WI38, while the viral promoter, CMV, expressed in both cell lines. The SV40 promoter was not active in these lung cell lines although it was functional in COS7 cells, which are monkey kidney cells transformed with SV40 large T antigen. Our results showed that K18EpilongSEAP is about 3 times more active than K18EpiSEAP (FIG. 15) and its expression lasted much longer than that from the CMV promoter in cell lines (FIG. 6). In vivo, a low level of long-lasting expression of the CFTR gene by K18Epilong may offer more clinical benefit to patients in lung gene therapy than the transient expression from viral promoters. K18EpilongSEAP also exhibits clear cell specificity in that its expression can only be detected in A549 cells, but not WI38 or another human lung fibroblast line, HLF (data not shown).

EXAMPLE 5 Expression of K18Epilong in Primary Lung Epithelial Cells

Because promoters active in cell lines are often not active in primary cells, we decided to test K18EpilongSEAP in rat lung primary cells. Although the K18Epilong expression in cell lines was much lower than that of the CMV promoter, its expression in rat lung primary cells was better than or comparable to that of the CMV promoter (FIG. 7).

EXAMPLE 6 K18 CFTR Expression in Cell Lines and in Primary Cells

Because K18Epilong can direct a high level of SEAP expression in rat lung primary cells, we built a CFTR expression cassette by replacing the SEAP coding sequence with CFTR cDNA to create K18EpilongTECFTR (FIG. 4c). The CFTR gene contains 27 exons and 26 introns, spanning over 250 kb on the long arm of human chromosome 7 (20, 30, 38), but the entire coding sequence is only about 4.5 kilobases in length. In order to enhance CFTR protein synthesis, we added a translational enhancer (18) to the 5′ end of the CFTR coding sequence and optimized the translation initiation sequence according to the Kozak sequence (23). To show that the CFTR gene was expressed from our expression cassette, we transfected COS7 cells with K18EpilongCFTR. FIG. 8 shows that the transfected cells have cAMP-dependent iodide effluxes, indicating that the episomally expressed CFTR can form functional channels in transfected cells. However, the activity of the CFTR channels was not as high as expected, indicating that the CFTR expression by this construct is not optimized. As shown in FIG. 9, we detected three CFTR mRNA species from transfected rat lung primary cells or IB3 cells using RT-PCR, indicating that two cryptic RNA splice-sites are activated; according to the sizes of the three PCR products, only about 25% of the mature CFTR species (the top band) are properly processed. Therefore, we modified the construct to improve the RNA splicing efficiency.

EXAMPLE 7 Optimizing RNA Splicing

Not much is known about the regulation of RNA splicing in lung cells, despite the important role that splicing can play in tissue-specific gene expression (35); e.g. the presence of rpL32 intron 3, which does not contain an enhancer, led to a 30-fold increase in mRNA relative to the intronless rpL32 minigene (21). Although the mechanism for stimulation of gene expression by regular introns is not clear, it is likely that the RNA splicing machinery may preferentially protect the intron-containing pre-mRNAs from nuclease degradation or facilitate the transport of the spliced mRNAs to the cytoplasm. Because intron 1 of the cytokeratin 18 gene contains a strong enhancer that is required for gene expression, we included it in the K18-based CFTR expression cassette. However, incorporation of a heterologous intron into a cDNA sequence could potentially activate cryptic splice-sites in the intron or in the cDNA and cause mis-splicing or alternative splicing. One potential solution to this problem is to put the intron after the coding sequence of the cDNA, as long as the intron and/or intron-containing enhancer works from downstream. When the K18 intron 1 in K18EpilongSEAP is moved downstream of the reporter gene, expression of the reporter gene is greatly diminished (FIG. 16). Therefore, we modified the K18EpilongTECFTR construct to enhance the desired RNA splicing and to eliminate undesired RNA splice-sites.

Typical eukaryotic introns contain relatively conserved, short sequencesrecognized by the splicing machinery, spliceosome (27). The consensussequences for the 5′ splice site, the branch site and 3′ splice site inmammals are AG/GURAGU, YNYURAC, and YAGM, respectively (R=purine,Y=pyrimidine, N=any nucleotide, and/indicates a splice site; theunderlined nucleotides are completely conserved). In addition, apolypyrimidine tract is often present near the 3′ splice site. WePCR-cloned the cDNA sequences derived from the alternatively splicedmRNAs and identified the splice-site junctions by DNA sequencing (FIG.10B). We then realized that the poly U (uracil) sequence is thepreferred polypyrimidine tract for the epithelial cells we used (FIG.10B). We also noticed that the K18 intron 5′ splice-site (AG/GUAAGG),putative branch-site (UUUUCAC), and 3′ splice-site (CAG/A) are nothighly conserved and can be potentially improved, since introns withmore conserved sequences are, in general, spliced more efficiently (21).We modified the DNA sequence of pK18EpilongTECFTR corresponding to thepolypyrimidine tract of the K18 intron 1 by changing five Cs (cytosineresidues) and three As (adenine residues) into Ts (thymine residues),which will be translated into Us in the pre-mRNA sequence (FIG. 10C). Wealso modified the 3′ splice site of the K18 intron by changing the firstnucleotide, A, of the following exon to G (FIG. 10C). Since thesenucleotides are not in the CFTR coding region, these changes would noteffect the protein produced from the expression plasmid. In addition, wehave made a single nucleotide change (A to G) in the CFTR coding region(see FIG. 10C) to destroy the second cryptic 3′ splice site. Weengineered the change in such a way so that the protein sequence remainsthe same and thus, the CFTR function will not be affected by thismodification. 
This new construct was designated K18EpilongmCFTR, or pK18mCFTR [SEQ ID NO: 1], and the previous version of the plasmid is referred to as K18EpilongTECFTR. As shown in FIG. 11, these changes very effectively eliminated the alternative RNA splicing and increased the steady-state level of the CFTR mRNA.
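
The comparison of the K18 intron 1 splice signals with the mammalian consensus can be illustrated with a short script. This is only a minimal sketch and not part of the original work: the function name `mismatches` and the variable names are hypothetical, the consensus and K18 sequences are those quoted above with the "/" removed, and only the 5′ splice site and the branch site are scored.

```python
# Nucleotide codes used in the consensus sequences quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "N": "ACGU",  # any nucleotide
}


def mismatches(consensus, site):
    """Return the 0-based positions where `site` departs from `consensus`."""
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return [i for i, (c, s) in enumerate(zip(consensus, site))
            if s not in IUPAC[c]]


# Sequences as given in the text, with the "/" splice-site marker removed.
FIVE_PRIME_CONSENSUS = "AGGURAGU"
K18_FIVE_PRIME = "AGGUAAGG"
BRANCH_CONSENSUS = "YNYURAC"
K18_BRANCH = "UUUUCAC"

five_prime_mm = mismatches(FIVE_PRIME_CONSENSUS, K18_FIVE_PRIME)
branch_mm = mismatches(BRANCH_CONSENSUS, K18_BRANCH)

print("5' splice site mismatch positions:", five_prime_mm)
print("branch site mismatch positions:", branch_mm)
```

Under these assumptions, each of the two K18 signals departs from its consensus at one position, which is consistent with the observation that the sites are not highly conserved.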

To show that the new construct expresses functional CFTR channels, we transfected COS-7 cells and performed iodide efflux assays. As shown in FIG. 12, a higher level of CFTR channel activity was observed in cells transfected with K18EpilongmCFTR [SEQ ID NO:1] than in cells transfected with the previous construct.

EXAMPLE 8 Expression Analysis of the K18 Regulatory Elements in Transgenic Mice

To demonstrate that the modified K18 5′ regulatory elements and intron 1 can direct cell-specific gene expression in lung epithelia in vivo, we carried out a transgenic analysis (28). The transgenic fetuses were identified by PCR and Southern blot analyses of the genomic DNA; the lungs of the 14-day fetuses were dissected out and stained with X-gal solution. These modified K18 DNA regulatory elements direct efficient and cell-specific expression of the E. coli LacZ gene in the lungs of the transgenic fetuses (FIGS. 13-14).

EXAMPLE 9 Expression in Calu-3 Cells

Since the human CFTR gene is heavily expressed in submucosal cells (12), we show that our epithelial expression cassettes function in these cells. The currently available cell line that resembles human submucosal cells is Calu-3, which was derived from a lung adenocarcinoma (available from the American Type Culture Collection). These cells express leukocyte protease inhibitor, lysozyme, and all markers of serous gland cells (34). They also express a high level of CFTR and, when confluent, show polarization typical of epithelia.

To show that our expression cassettes direct gene expression in Calu-3 cells, we transfect these cells with K18EpilongSEAP and perform quantitative assays of secreted alkaline phosphatase activity. The SEAP reporter system is the most convenient assay system because only a small amount of culture medium is required for each assay. The E. coli LacZ gene is also a useful reporter.

EXAMPLE 10 Expression in Lung Sections

We show the activity of the expression cassettes in vivo. A recently revived technique of lung slice culture (24, 39) is valuable for assessment of expression cassettes. Lungs of mice or rats are excised from anesthetized animals and inflated with 2% liquid agarose at 37° C. through the trachea. Following cooling to 4° C., the lungs are cut into 0.2 to 1.0 mm thick slices and cultured overnight in cell culture medium. Cells in these lung slices can survive for up to seven days (24, 39). Since more cell-cell interactions are maintained in the lung sections, gene expression in these sections should be more relevant to gene expression in vivo. In addition to the preservation of cell-cell interactions, there are other reasons for using this method: the transfection conditions for lung slices can be easily controlled, and one mouse lung can be sectioned into many slices for testing many constructs at once, whereas more animals have to be used for the same experiment in vivo. The mouse lung slices are transfected with the K18EpilongLacZ construct with DODAC:DOPE in the same way as for cultured cells (see above) in a 6-well dish and cultured at 37° C. for two days. We use LacZ as a reporter in lung slices because its β-galactosidase activity can be easily measured with chemiluminescent assays as well as by cell staining with X-gal. The transfected tissue slices are homogenized for the β-galactosidase activity assay or fixed for (i) cell staining, (ii) in situ hybridization to detect cell-specificity of RNA expression, and (iii) fluorescent immunostaining of reporter gene products (the anti-β-galactosidase antibody is available from Clontech).

EXAMPLE 11 Expression in Model Animals

Gene expression studies in model animals are necessary for any expression cassette to be used for gene therapy because regulation of gene expression in model animals resembles that in humans better than any in vitro system. We transfect CD1 mice in triplicate with K18EpilongLacZ using an intra-tracheal instillation technique established by Dr. O'Brodovich's group (at the Hospital for Sick Children, Toronto, Canada) and others. Other transfection techniques known in the art may also be used. A negative control plasmid, K18Epilong (vector), is included in the study. The β-galactosidase activity in lung cells is determined initially 2 days after transfection by using the chemiluminescent assays. To carry out a time course study, transfected mice are sacrificed at days 7, 14, 21 and 28 post-transfection and the β-galactosidase activity in lung cells is assayed.

The best animal models available for cystic fibrosis are the CF knock-out mice, which are available at the animal facility of the Hospital for Sick Children (Toronto, Canada). We express K18EpilongCFTR in CF knock-out mice. Dr. O'Brodovich has confirmed the observations (14) that UNC CF mice have a higher basal potential difference (PD) and fail to change their PD in response to lowered luminal chloride concentration. We transfect the UNC CF knock-out mice with our CFTR expression construct through intra-tracheal instillation. A vector plasmid is used as a negative control. The cell-specific expression of the human CFTR mRNA is assessed by fluorescent in situ RT-PCR and the human CFTR protein is detected by fluorescent immunostaining. Although there are not many high-quality antibodies to CFTR available for in vivo detection, Demolombe et al. (8) have recently optimized the conditions for immunofluorescent staining of human CFTR with a monoclonal antibody, MATG 1031. We also transfect the UNC CF mice with the same CFTR construct by nasal instillation and measure the nasal PD of the transfected mice.

EXAMPLE 12 Temporal Expression of the lacZ Gene

We analyzed temporal expression of the lacZ gene. FIG. 17 shows some of the results on β-galactosidase expression driven by K18 DNA regulatory elements in K18EpilongmTELacZ [SEQ ID NO:19] at different gestational ages in the transgenic mice.

EXAMPLE 13 Submucosal Expression of the lacZ Reporter Gene

Human airway submucosal glands play a major role in maintaining the volume and composition of airway surface fluid, which is important in airway clearance and protection from infection by microorganisms. FIG. 18 shows that the expression cassette we developed (K18EpilongmTELacZ) [SEQ ID NO:19] can target lacZ reporter gene expression to submucosal glands in the trachea of adult transgenic mice.

The expression cassettes of this invention may be used in epithelial tissue gene therapy, particularly lung epithelial tissue gene therapy. The pharmaceutical compositions of this invention used to treat patients having degenerative diseases, disorders or abnormal physical states of the epithelial tissue could include an acceptable carrier, auxiliary or excipient. The conditions which may be treated by the expression cassettes include cystic fibrosis, emphysema, and cancers of epithelial origin arising in the lung or other organs.

The pharmaceutical compositions can be administered to humans or animals by methods such as aerosol administration, intratracheal instillation and intravenous injection. Dosages to be administered depend on patient needs, on the desired effect and on the chosen route of administration. The expression cassettes may be introduced into epithelial cells using in vivo delivery vehicles such as liposomes. They may also be introduced into these cells using physical techniques such as microinjection and electroporation or chemical methods such as coprecipitation and incorporation of DNA into liposomes. The expression cassette may be introduced into epithelial cells, such as submucosal cells, using these techniques. The expression cassettes may also be used in gene expression studies.

The pharmaceutical compositions can be prepared by known methods for the preparation of pharmaceutically acceptable compositions which can be administered to patients, such that an effective quantity of the expression cassette is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Mack Publishing Company, Easton, Pa., USA).

On this basis, the pharmaceutical compositions could include an active compound or substance, such as an episomal expression cassette and one or more genes to be expressed, in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and isoosmotic with the physiological fluids. The methods of combining the expression cassettes with the vehicles or diluents are well known to those skilled in the art. The composition could include a targeting agent for the transport of the active compound to specified sites within the epithelial tissue.

Materials and Methods

Construction of reporter gene and CFTR expression cassettes. Polymerase chain reactions (PCR) were performed with Pfu polymerase (Stratagene) and primer pairs (K18P3-5′GCAACGCGTCAGGTAAGGGGTAGG [SEQ ID NO: 2]/K18P4-5′CGAAGATCTGGAGGGATTGTAGAGAG [SEQ ID NO: 3]), (K18XH5′-5′CATAATAACGTCATTTCCTGCCC [SEQ ID NO: 4]/K18P2*-5′GCTACGCGTGAGAGAAAGGACAGGACTC [SEQ ID NO: 5]), (K18NsiI-5′CTCACAGTAGGTGCTGAATGC [SEQ ID NO: 6]/K18XH3′-5′GACACGGACAGCAGGTGTTGTTG [SEQ ID NO: 7]) and (K18P1-5′CGAGGTACCAATAACAGTAAAAGGCAGTAC [SEQ ID NO: 8]/K18NsiIR-5′CACCGGTATATCACCTTTCCTGC [SEQ ID NO: 9]) on genomic DNA of human lung epithelial cells (A549) to isolate the first intron, minimal promoter, and two 5′ untranslated regions, respectively, of the human K18 gene. PCR products were verified by restriction mapping, according to restriction patterns predicted from the published sequence, before cloning into the polylinker region of pSEAP (Tropix) via naturally occurring restriction sites or sites introduced by the PCR primers. The primers were purchased from ACGT Corp., Toronto, and the PCR machine (DNA Engine, PTC-200) was purchased from Fisher.
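
The restriction sites carried by the PCR primers can be located with a simple string search. The following is a minimal sketch, not the authors' procedure: the recognition sequences are the standard ones for Kpn I, Mlu I, Bgl II and Nco I, but the choice of these four enzymes and the helper name `find_sites` are assumptions made for illustration, and the text itself names only Kpn I and Nco I.

```python
# Standard recognition sequences of four common restriction enzymes.
# Which enzymes were actually used for each cloning step is an assumption.
SITES = {
    "KpnI": "GGTACC",
    "MluI": "ACGCGT",
    "BglII": "AGATCT",
    "NcoI": "CCATGG",
}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "K18P3": "GCAACGCGTCAGGTAAGGGGTAGG",
    "K18P4": "CGAAGATCTGGAGGGATTGTAGAGAG",
    "K18P2*": "GCTACGCGTGAGAGAAAGGACAGGACTC",
    "K18P1": "CGAGGTACCAATAACAGTAAAAGGCAGTAC",
}


def find_sites(primer, sites=SITES):
    """Map enzyme name to the 0-based offset of its site, if present."""
    return {name: primer.find(seq) for name, seq in sites.items()
            if seq in primer}


site_map = {name: find_sites(seq) for name, seq in PRIMERS.items()}
for name, hits in site_map.items():
    print(name, hits)
```

Each of these four primers carries one such site after a three-nucleotide 5′ extension, which is a common primer design for cloning PCR products by restriction digestion.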

The translation initiation sequence of the human CFTR cDNA was modified by PCR, using a cftrp1 primer (of sequence 5′GAGACCATGGAGAGGTCG [SEQ ID NO: 10]), to introduce an Nco I site and to improve the initiation signal according to Kozak's rule. A linker containing the alfalfa mosaic virus translational enhancer (TE) sequence (5′GTTTTTATTTTTAATTTTCTTTCAAATACTTCCA [SEQ ID NO: 11]) was inserted immediately upstream of the Nco I site. The SEAP coding region in K18EpilongSEAP was then replaced with the TE-4.6 kb CFTR cDNA fragment, resulting in the K18EpilongTECFTR construct.
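
The two features of the cftrp1 primer described above, the Nco I site and the improved initiation context, can be checked with a few lines of code. This is a hedged sketch: the function name `kozak_context` is hypothetical, and the criterion used (a purine at position −3 and a G at position +4, counting the A of the ATG as +1) is the commonly cited summary of Kozak's rule rather than a definition taken from this document.

```python
def kozak_context(seq):
    """Return the bases at -3 and +4 around the first ATG (A of ATG is +1)."""
    start = seq.find("ATG")
    if start < 3 or start + 3 >= len(seq):
        raise ValueError("need 3 bases upstream and 1 base downstream of ATG")
    return seq[start - 3], seq[start + 3]


CFTRP1 = "GAGACCATGGAGAGGTCG"  # cftrp1 primer from the text (5' to 3')

minus3, plus4 = kozak_context(CFTRP1)
strong_context = minus3 in "AG" and plus4 == "G"  # purine at -3, G at +4
has_nco_site = "CCATGG" in CFTRP1                 # Nco I recognition sequence

print("-3:", minus3, "+4:", plus4, "strong context:", strong_context)
print("Nco I site present:", has_nco_site)
```

Because the Nco I recognition sequence CCATGG itself spans the ATG and supplies the G at +4, a single primer can introduce the restriction site and the favourable initiation context together.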

PCR mutagenesis was performed on K18EpilongTECFTR using a 2-step nested PCR strategy. First-round PCR reactions used the primer pairs (TE2-5′GTCCGCAAAGCCTGAGTCCTGTCC [SEQ ID NO: 12]/K183′SS-5′AAATTAAAAATAAAAACAGACCTGAAAAAAAAAAGAGAGAGGTTGTTCCATGA [SEQ ID NO: 13]) and (TEtop-5′GATCTGTTTTTATTTTTAATTTTCTTTCAAATACTTCCACCATGGCCCC [SEQ ID NO: 14]/cftr3′SS-5′GGTGACTTCCCCCAAATATAAAAAG [SEQ ID NO: 15]). Products from the first-round reactions were mixed and served as templates for the second-round PCR using the TE2 [SEQ ID NO: 12] and cftr3′SS [SEQ ID NO: 15] primers. The K18mCFTR construct was then generated by cloning the second-round PCR product back into K18EpilongTECFTR to replace the corresponding parental fragment.
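
A two-step strategy of this kind relies on the two first-round products sharing a stretch of sequence through which they can anneal in the second round. The sketch below, which is an illustration and not the authors' analysis, looks for that shared stretch by comparing the reverse complement of the K183′SS primer with the TEtop primer. The helper names are hypothetical, and the K183′SS sequence is used with the stray space in the printed sequence removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_common_substring(a, b):
    """Dynamic-programming search for the longest substring shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Primer sequences copied from the text (5' to 3').
K183SS = "AAATTAAAAATAAAAACAGACCTGAAAAAAAAAAGAGAGAGGTTGTTCCATGA"
TETOP = "GATCTGTTTTTATTTTTAATTTTCTTTCAAATACTTCCACCATGGCCCC"

# The first product ends with the reverse complement of its reverse primer.
product1_end = reverse_complement(K183SS)
overlap = longest_common_substring(product1_end, TETOP)

print("shared sequence:", overlap, "length:", len(overlap))
```

The shared stretch lies at the 3′ end of the first product and near the 5′ end of TEtop, so the mixed first-round products can prime on each other before amplification with the outer primers.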

Tissue Culture and Transfection. A549, a human lung carcinoma cell line, and COS-7 cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum (FBS). Human lung fibroblasts, WI-38, were maintained in alpha minimum essential medium (alpha-MEM) with 10% FBS. IB3, a human cystic fibrosis bronchial epithelial cell line, was cultured in LHC-8 with 5% FBS. Day 19 rat fetal lung epithelium and fibroblast cells were isolated according to standard procedure and maintained in alpha-MEM with 10% FBS.

For transfection, cells were seeded at 50-80% confluency in six-well plates and allowed to settle in their regular media overnight. The cells were then transfected in serum-free media with 1 μg DNA premixed with 12 μg of lipofectamine (GibcoBRL) per well according to the recommended procedure. Primary cells were transfected with premixed DNA:lipid complexes consisting of 1.66 μg DNA and 16.6 μg DODAC:DOPE (1:1 dioleyldimethylammonium chloride:dioleoylphosphatidylethanolamine, INEX) in serum-free media for 24 hr.

Reporter Assay. Culture media from transfected plates were collected at the indicated time points post-transfection, before changes of media, and centrifuged for 1 min at 16,000×g. Supernatant was frozen at −80° C. or assayed immediately. Secreted alkaline phosphatase activities in the media were detected with the Phospha-Light chemiluminescent assay system (Tropix) as recommended and measured on a luminometer (BioOrbit).

Detection of CFTR mRNA. DNase I-treated total RNA from transfected cells, prepared with an RNeasy column (Qiagen), was subjected to reverse transcription, followed by PCR (30 cycles) using TE1 (5′CTGTCCTTTCTCTCACGCGTCAG [SEQ ID NO: 16]) or TE2 [SEQ ID NO: 12] in combination with cftrp2 (5′GAGGAGTGCCACTTGC [SEQ ID NO: 17]) or cftrp3 (5′GTTGTTGGAAAGGAGACTAACAAG [SEQ ID NO: 18]) primers.

Functional Analysis of CFTR Protein. Iodide efflux assays were performed 48 hr post-transfection as previously described (5). Slight modifications were made to the composition of the loading buffer, which is 136 mM NaI, 4 mM KNO₃, 2 mM Ca(NO₃)₂, 2 mM Mg(NO₃)₂, 11 mM glucose, and 20 mM HEPES, pH 7.4, and to the agonists, which are 20 mM forskolin, 0.5 mM 8-(4-chlorophenylthio)-adenosine 3′,5′-cyclic monophosphate (CPT-cAMP), and 0.5 mM 3-isobutyl-1-methylxanthine (IBMX).

Production of Transgenic Mice. The K18mLacZ construct was made by replacing the human CFTR coding region in the K18mCFTR plasmid with the E. coli LacZ gene. The K18mLacZ expression cassette was released by digestion with Kpn I. The DNA fragments were separated by agarose gel electrophoresis and purified through an Elutip column (Schleicher & Schuell) following electroelution. The DNA fragments were microinjected into the pronuclei of fertilized eggs of STL/B16 mice. Fertilized eggs that proceeded to the 2-cell stage were transferred to pseudo-pregnant CD1 recipients. The lungs of the 14-day fetuses were dissected out and stained with X-gal solution, and the transgenic fetuses were identified by PCR and Southern blot analyses of the genomic DNA.

The present invention has been described in terms of particular embodiments found or proposed by the present inventor to comprise preferred modes for the practice of the invention. It will be appreciated by those of skill in the art that, in light of the present disclosure, numerous modifications and changes can be made in the particular embodiments exemplified without departing from the intended scope of the invention. All such modifications are intended to be included within the scope of the appended claims.

All publications, patents and patent applications are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. This application claims priority from Canadian application no. 2,205,076, which is incorporated by reference in its entirety.

References

1. Baskar, J. F., P. P. Smith, G. Nilaver, R. A. Jupp, S. Hoffmann, N. J. Peffer, D. J. Tenney, P. A. Colberg, P. Ghazal and J. A. Nelson. 1996. The enhancer domain of the human cytomegalovirus major immediate-early promoter determines cell type-specific expression in transgenic mice. Journal of Virology 70:3207-14.

2. Bronstein, I., J. Fortin, P. E. Stanley, G. S. Stewart and L. J. Kricka. 1994. Chemiluminescent and bioluminescent reporter gene assays. [Review]. Analytical Biochemistry 219:169-81.

3. Bronstein, I., J. J. Fortin, J. C. Voyta, R. R. Juo, B. Edwards, C. E. Olesen, N. Lijam and L. J. Kricka. 1994. Chemiluminescent reporter gene assays: sensitive detection of the GUS and SEAP gene products. Biotechniques 17:172-4.

4. Caplen, N. J., E. W. Alton, P. G. Middleton, J. R. Dorin, B. J. Stevenson, X. Gao, S. R. Durham, P. K. Jeffery, M. E. Hodson, C. Coutelle et al. 1995. Liposome-mediated CFTR gene transfer to the nasal epithelium of patients with cystic fibrosis [see comments]. Nature Medicine 1:39-46.

5. Chang, X. B., J. A. Tabcharani, Y. X. Hou, T. J. Jensen, N. Kartner, N. Alon, J. W. Hanrahan and J. R. Riordan. 1993. Protein kinase A (PKA) still activates CFTR chloride channel after mutagenesis of all 10 PKA consensus phosphorylation sites. Journal of Biological Chemistry 268:11304-11.

6. Colledge, W. H. 1994. Cystic fibrosis gene therapy. [Review]. Current Opinion in Genetics & Development 4:466-71.

7. Crystal, R. G. 1995. Transfer of genes to humans: early lessons and obstacles to success. [Review]. Science 270:404-10.

8. Demolombe, S., I. Baro, Z. Bebok, J.-P. Clancy, E. J. Sorscher, A. Thomas-Soumarmon, A. Pavirani and D. Escande. 1996. A method for the rapid detection of recombinant CFTR during gene therapy in cystic fibrosis. Gene Therapy 3:685-694.

9. Ellis, J., D. Talbot, N. Dillon and F. Grosveld. 1993. Synthetic human beta-globin 5′HS2 constructs function as locus control regions only in multicopy transgene concatamers. Embo Journal 12:127-34.

10. Ellis, J., U. K. Tan, A. Harper, D. Michalovich, N. Yannoutsos, S. Philipsen and F. Grosveld. 1996. A dominant chromatin-opening activity in 5′ hypersensitive site 3 of the human beta-globin locus control region. Embo Journal 15:562-8.

11. Engelhardt, J. F., R. H. Simon, Y. Yang, M. Zepeda, P. S. Weber, B. Doranz, M. Grossman and J. M. Wilson. 1993. Adenovirus-mediated transfer of the CFTR gene to lung of nonhuman primates: biological efficacy study. Human Gene Therapy 4:759-69.

12. Engelhardt, J. F. and J. M. Wilson. 1992. Gene therapy of cystic fibrosis lung disease. Journal of Pharmacy & Pharmacology 1:165-7.

13. Grosveld, F., M. Antoniou, M. Berry, B. E. De, N. Dillon, J. Ellis, P. Fraser, O. Hanscombe, J. Hurst, A. Imam et al. 1993. The regulation of human globin gene switching. [Review]. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences 339:183-91.

14. Grubb, B. R., R. N. Vick and R. C. Boucher. 1994. Hyperabsorption of Na+ and raised Ca(2+)-mediated Cl- secretion in nasal epithelia of CF mice. American Journal of Physiology 266:C1478-C1483.

15. Hazinski, T. A. 1993. Gene transfection of lung cells in vitro and in vivo. [Review]. Annual Review of Physiology 55:181-207.

16. Jefferson, R. A., T. A. Kavanagh and M. W. Bevan. 1987. GUS fusions: beta-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. Embo Journal 6:3901-7.

17. Jeong, S. and A. Stein. 1994. Micrococcal nuclease digestion of nuclei reveals extended nucleosome ladders having anomalous DNA lengths for chromatin assembled on non-replicating plasmids in transfected cells. Nucleic Acids Research 22:370-5.

18. Jobling, S. A. and L. Gehrke. 1987. Enhanced translation of chimaeric messenger RNAs containing a plant viral untranslated leader sequence. Nature 325:622-5.

19. Kalderon, D., B. L. Roberts, W. D. Richardson and A. E. Smith. 1984. A short amino acid sequence able to specify nuclear location. Cell 39(3 Pt 2):499-509.

20. Kerem, B., J. M. Rommens, J. A. Buchanan, D. Markiewicz, T. K. Cox, A. Chakravarti, M. Buchwald and L. C. Tsui. 1989. Identification of the cystic fibrosis gene: genetic analysis. Science 245:1073-80.

21. Korb, M., Y. Ke and L. F. Johnson. 1993. Stimulation of gene expression by introns: conversion of an inhibitory intron to a stimulatory intron by alteration of the splice donor sequence. Nucleic Acids Research 21:5901-8.

22. Korfhagen, T. R., S. W. Glasser and B. R. Stripp. 1994. Regulation of gene expression in the lung. [Review]. Current Opinion in Pediatrics 6:255-61.

23. Kozak, M. 1991. Structural features in eukaryotic mRNAs that modulate the initiation of translation. [Review]. Journal of Biological Chemistry 266:19867-70.

24. Kurosawa, H., C. G. Wang, R. J. Dandurand, M. King and D. H. Eidelman. 1995. Mucociliary function in the mouse measured in explanted lung tissue. Journal of Applied Physiology 79:41-6.

25. Mitchell, P. J. and R. Tjian. 1989. Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. [Review]. Science 245:371-8.

26. Moll, R., W. W. Franke, D. L. Schiller, B. Geiger and R. Krepler. 1982. The catalog of human cytokeratins: patterns of expression in normal epithelia, tumors and cultured cells. [Review]. Cell 31:11-24.

27. Moore, M. J., C. C. Query and P. A. Sharp. 1993. Splicing of precursors to mRNA by the spliceosome. In The RNA World. chapter 13:303-357.

28. Neznanov, N., I. S. Thorey, G. Cecena and R. G. Oshima. 1993. Transcriptional insulation of the human keratin 18 gene in transgenic mice. Molecular & Cellular Biology 13:2214-23.

29. Pankov, R., N. Neznanov, A. Umezawa and R. G. Oshima. 1994. AP-1, ETS, and transcriptional silencers regulate retinoic acid-dependent induction of keratin 18 in embryonic cells. Molecular & Cellular Biology 14:7744-57.

30. Rommens, J. M., M. C. Iannuzzi, B. Kerem, M. L. Drumm, G. Melmer, M. Dean, R. Rozmahel, J. L. Cole, D. Kennedy, N. Hidaka et al. 1989. Identification of the cystic fibrosis gene: chromosome walking and jumping. Science 245:1059-65.

31. Rooney, S. A., S. L. Young and C. R. Mendelson. 1994. Molecular and cellular processing of lung surfactant. [Review]. Faseb Journal 8:957-67.

32. Sawaya, P. L., B. R. Stripp, J. A. Whitsett and D. S. Luse. 1993. The lung-specific CC10 gene is regulated by transcription factors from the AP-1, octamer, and hepatocyte nuclear factor 3 families. Molecular & Cellular Biology 13:3860-71.

33. Schmidt, E. V., G. Christoph, R. Zeller and P. Leder. 1990. The cytomegalovirus enhancer: a pan-active control element in transgenic mice. Molecular & Cellular Biology 10:4406-11.

34. Shen, B. Q., W. E. Finkbeiner, J. J. Wine, R. J. Mrsny and J. H. Widdicombe. 1994. Calu-3: a human airway epithelial cell line that shows cAMP-dependent Cl- secretion. American Journal of Physiology 266(5 Pt 1):L493-501.

35. Smith, C. W., J. G. Patton and G. B. Nadal. 1989. Alternative splicing in the control of gene expression. [Review]. Annual Review of Genetics 23:527-77.

36. Talbot, D., P. Descombes and U. Schibler. 1994. The 5′ flanking region of the rat LAP (C/EBP beta) gene can direct high-level, position-independent, copy number-dependent expression in multiple tissues in transgenic mice. Nucleic Acids Research 22:756-66.

37. Tizzano, E. F., H. O'Brodovich, D. Chitayat, J. C. Benichou and M. Buchwald. 1994. Regional expression of CFTR in developing human respiratory tissues. American Journal of Respiratory Cell & Molecular Biology 10:355-62.

38. Tsui, L. C. 1995. The cystic fibrosis transmembrane conductance regulator gene. [Review]. American Journal of Respiratory & Critical Care Medicine 151:S47-53.

39. Vallan, C., R. R. Friis and P. H. Burri. 1995. Release of a mitogenic factor by adult rat lung slices in culture. Experimental Lung Research 21:469-87.

40. Venkatesh, V. C., B. C. Planer, M. Schwartz, J. N. Vanderbilt, R. T. White and P. L. Ballard. 1995. Characterization of the promoter of human pulmonary surfactant protein B gene. American Journal of Physiology 268:L674-682.

41. Wert, S. E., S. W. Glasser, T. R. Korfhagen and J. A. Whitsett. 1993. Transcriptional elements from the human SP-C gene direct expression in the primordial respiratory epithelium of transgenic mice. Developmental Biology 156:426-43.

42. Woodcock, M. J., K. B. Adler and R. B. Low. 1984. Immunohistochemical identification of cell types in normal and in bleomycin-induced fibrotic rat lung. Cellular origins of interstitial cells. American Review of Respiratory Disease 130:910-6.

43. Woodcock, M. J., J. J. Mitchell, S. E. Reynolds, K. O. Leslie and R. B. Low. 1990. Alveolar epithelial cell keratin expression during lung development. American Journal of Respiratory Cell & Molecular Biology 2:503-14.

44. Zabner, J., A. J. Fasbender, T. Moninger, K. A. Poellinger and M. J. Welsh. 1995. Cellular and molecular barriers to gene transfer by a cationic lipid. Journal of Biological Chemistry 270:18997-9007.

45. Zeitlin, P. L., L. Lu, J. Rhim, G. Cutting, G. Stetten, K. A. Kieffer, R. Craig and W. B. Guggino. 1991. A cystic fibrosis bronchial epithelial cell line: immortalization by adeno-12-SV40 infection. American Journal of Respiratory Cell & Molecular Biology 4:313-9.

19 12143 base pairs nucleic acid double circular other nucleic acid/desc = “Mixture of genomic DNA, enhancer 8..2570 /standard_name= ”K18Enhancer/Promoter“ /note= ”DNA fragment was obtained by PCR-cloning andminor modifications were introduced for the purpose of PCR.“ intron2571..3318 /standard_name= ”K18 intron 1“ /note= ”DNA fragment wasobtained by PCR-cloning and modifications were introduced to improve thesplicing efficiency.“ enhancer 3319..3354 /standard_name= ”Alfalfamosaic virus translational enhancer“ /note= ”Fragment was synthesizedchemically.“ misc_feature 3355..7948 /standard_name= ”CFTR cDNA“misc_feature 7949..7984 /standard_name= ”pBluescript II KS(+) multiplecloning site“ intron 8507..8572 /standard_name= ”SV40 small t antigenintron“ polyA_signal 9178..9212 /standard_name= ”SV40 polyadenylationsignal“ polyA_signal 12021..12055 /standard_name= ”SV40 polyadenylationsignal“ rep_origin 9562..10205 /standard_name= ”pUC origin ofreplication“ misc_feature 11283..11353 /standard_name= ”Ampicillinresistance gene“ misc_feature 11345..11800 /standard_name= ”f1 singlestrand DNA origin“ 1 GGTACCAATA ACAGTAAAAG GCAGTACATA GCTTGTTGACTCCACATACT TTATTATAAA 60 ATACTGCCCA ACTTGACAGT TCTGGAATCC AGTGGGGGAATATAAAGGTG AAAGCAGGAG 120 AGACCCCTCT GACTGGAACC TCTTACCTCC CAGAAGCCTTGTATGCAAAA CCAGTGGGCA 180 TTCATTTGTA TGTTATTTTG CATCCCGTTT GCCTCCCAGCCTTCAGCAGG CCCCGACCCT 240 CCCCTGGCCA GCTTCCACCC TGACTGCCCC CTGGCTGGCTCCCATTGAGC ACTGTGGGCT 300 CTCCCCACCA TTAGGTGACA GATCAGGAAC AATCCAGGCTCAGGCTCTTT ATCTGTGCTC 360 TGCCTCCCAC CTGGCAGGTC CACTGGCCAG GCTTTTCCAGGGTCCCTTCT CTCCCAGGTC 420 TGCCCTACTA TTTGTCCTCC CCTTCCCCCT CAGCTGGTAGCTCGATAAGA ATCAATAGGT 480 CCACTCCAGA GCAAAGAACA CAGCCAAATG TGTCATACCAGGCCCTGCCA GAAAAACGAG 540 CTGCTGGAGC TGACAAACTT GAAGGCCAAA CACCTAAGGTTCCCCCCAAC ACTTCATTCA 600 GCAGGGATGG TCATTCAGCT TCAGGGGGCA GGCAGCATGAAAGCCTCCCT ACCTCCATCC 660 TTCTCACACA GAGGCTGGGG AGAGCATCTT GGAGGATGCAGTCCCCTGGG GCCAGGCTTC 720 TAATCCAGAC AGCCCTTACA AGGGGGGACA GGGGAAGGACTGGCTTGGAG AAAAGTCCTA 
780 GAAAAGAGGG GAGGGGCACT GGCCACCAGG GCTGGGTCGCTGCTATGATG GTCCTAGGAG 840 TGCCTGCCTG TCCTCTCAGG CCCCATGCGA TGTAGGACACATTACTTTTA TTTATTTATT 900 TATTTATTTT GAGTCAGAGT TTCGCTCTGG TTGCCCAGGCTGGAGCGCGA CGGCACGATC 960 TTGGCTCACT GCAACCTCTG CCTCCTGGGT TCAAGCGATTCTCCTGCCTC AGCCTCCTGA 1020 GTAGCTGGGA TTACAGGCAC ACACTGTGCT GGTTAATTTTTGTATTTTTA GTAGAGAAGG 1080 GGTGTCACCA TGTTGGTCAG GCTGGTCTCA AATTTTTTTTTTTTTTTTTT TTTTTTTTTG 1140 AGACAGAGTC TTGCTCTGTT GTCTAGGCTG GAGTGCAGTGGCATCGAACT CTTGACCTCA 1200 AGTGATCCAC CCGCCTCGGC CTCCCAAAGT GCTTGGATTACAGGCATGAG CCACTGTGCC 1260 CGGCGATGTG GGACACATTA TCATCTCTGT GAGAGATTTTTGGTCTCTTT TGTCACCGCC 1320 CTTCTCTCCC AGCTCCTAGA ACTGGGCCTG GCTCACAGTAGGTGCTGAAT GCATACTGGT 1380 TGAATTGTAA ATGCTCAGGA TTTGTTTAAT TAAGGATGCAGGAAAGGTGA TATACCGGTG 1440 TGCAGAAGTC AGGATGCATT CCCTGTCCAA ATCACAGTGTTCCACTGAGG CAAGGCCCTT 1500 GGGAGTGAGG TCGGGAGAGG GGAGGGTGGT GGAGGGGGCTCAGAGACTGG GTTTGTTTTG 1560 GGGAGTCTGC ACCTATTTGC TGAGTGAATG TATGTGTGTGTGCATTTGAG AGCACACCTC 1620 TGTATGATTC GGGTGTGAGT GTGTGTGAGG AAACGTGGGCAGGCGAGGAG TGTTTGGGAG 1680 CCAGGTGCAG CTGGGGTGTG AGTGTGTAAG CAAGCAGCTATGAGGCTGGG CATTGCTTCT 1740 CCTCCTCTTC TCCAGCTCCC AGCCTTTCTT CCCCGGGACTCCTGGGGCTC CAGGATGCCC 1800 CCAAGATCCC CTCCACAAGT GGATAATTTG GGCTGCAGGTTAAGGACAGC TAGAGGGACT 1860 CACAGGCCAT TCCACCCGCA CACCACCAGA CCCCCAAATTTCTTTTTTCT TTTTTTTTTG 1920 AGACAGAGTC TCACTCTGTC GCCAGGCTGC AGTGGCGCGATCTCGGCTCA CTGCAACCTC 1980 CGCCTCCCAG GTTCAAGCGA TTCCCCTTCC TCAGCCTCCCAAGTAGCTGA GACTACAGGC 2040 GTGCACCATC ACGTCCGGCT AATTTTTTGT ATTTTAGTAGAGAGGGGTTT CACCATGTTG 2100 GCTAGGATGG TCTCGATCTC CTGACCTCGT GATCCGCCCACCTAGGCCTC CCAAAGTGCT 2160 GAGATTACAG GCGTGAGCCA CTGCGCCCGG TCAAGACTCCCAAATTTCAA ACTCGCCAGC 2220 ACCTCCTCCA CCTGGGGGAG AAGAGCATAA TAACGTCATTTCCTGCCCTG AAAGCAGCCT 2280 CGAGGGCCAA CAACACCTGC TGTCCGTGTC CATGCCCGGTTGGCCACCCC GTTTCTGGGG 2340 GGTGAGCGGG GCTTGGCAGG GCTGCGCGGA GGGCGCGGGGGTGGGGCCCG GGGCGGAGCG 2400 GCCCGGGGCG GAGGGCGCGG GCTCCGAGCC GTCCACCTGTGGCTCCGGCT TCCGAAGCGG 2460 CTCCGGGGCG GGGGCGGGGC CTCACTCTGC 
GATATAACTCGGGTCGCGCG GCTCGCGCAG 2520 GCCGCCACCG TCGTCCGCAA AGCCTGAGTC CTGTCCTTTCTCTCACGCGT CAGGTAAGGG 2580 GTAGGAGGGA CCTCAACTCC CAGCCTTGTC TGACCCTCCAATTATACACT CCTTTGCCTC 2640 TTTCCGTCAT TCCATAACCA CCCCAACCCC TACTCCACCGGGAGGGGGTT GGGCATACCT 2700 GGATTTCCAT CCGCGCACCT AGCCACAGGG TCCCTAAGAGCAGCAGCAGC TAGGCATGGG 2760 AGGGCTCTTT CCCAGGAGAG AGGGGGAAGG GGACAGGGTTGAGAGCTTTA CAGAGGAAGT 2820 GGACAGCATG GAGGGAGGTA AGGAAAGGCC TGTAAAGAGGAGGAGACACT GGCTCTGGCG 2880 GAATGGGGAC TATTGGAGGG TTAAGCGGAT GTGGCTAAGGCTGAGTCATC TAGGAGTAAA 2940 CAAGAGGCCT TCCTTTGGGA GGAGCCAATC CAGGGTGTAGGGGGCCCAGA GTGACCAGGT 3000 GCACTAGGGA AAAAATGCCA GGAGAGGGCC AGGAAGAGGACTTGTTAGTA GCGACTCACT 3060 TCTGGGCAGG CAGGCCAGCC AGCTAGCCAG CCTGCTGAGGCTTCCCAAGA GGGGCAGAGT 3120 GCTGGGATCT GGGAATCCAG GAAAGGAGGG AATGGGGTGGGGCTAGATGA AAAGGGATAG 3180 GTGTCCAGGG AGAGCCTCTG GCTATTCCTG GGACCAGGAAGTTTTCACTA GGATACATAA 3240 CACTTTTTAC ACACTCACCC CACCCATCCC TGGCTTTCTATTCATGGAAC AACCTCTCTC 3300 TTTTTTTTTT TCAGGTCTGT TTTTATTTTT AATTTTCTTTCAAATACTTC CACCATGGAG 3360 AGGTCGCCTC TGGAAAAGGC CAGCGTTGTC TCCAAACTTTTTTTCAGCTG GACCAGACCA 3420 ATTTTGAGGA AAGGATACAG ACAGCGCCTG GAATTGTCAGACATATACCA AATCCCTTCT 3480 GTTGATTCTG CTGACAATCT ATCTGAAAAA TTGGAAAGAGAATGGGATAG AGAGCTGGCT 3540 TCAAAGAAAA ATCCTAAACT CATTAATGCC CTTCGGCGATGTTTTTTCTG GAGATTTATG 3600 TTCTATGGAA TCTTTTTATA TTTGGGGGAA GTCACCAAAGCAGTACAGCC TCTCTTACTG 3660 GGAAGAATCA TAGCTTCCTA TGACCCGGAT AACAAGGAGGAACGCTCTAT CGCGATTTAT 3720 CTAGGCATAG GCTTATGCCT TCTCTTTATT GTGAGGACACTGCTCCTACA CCCAGCCATT 3780 TTTGGCCTTC ATCACATTGG AATGCAGATG AGAATAGCTATGTTTAGTTT GATTTATAAG 3840 AAGACTTTAA AGCTGTCAAG CCGTGTTCTA GATAAAATAAGTATTGGACA ACTTGTTAGT 3900 CTCCTTTCCA ACAACCTGAA CAAATTTGAT GAAGGACTTGCATTGGCACA TTTCGTGTGG 3960 ATCGCTCCTT TGCAAGTGGC ACTCCTCATG GGGCTAATCTGGGAGTTGTT ACAGGCGTCT 4020 GCCTTCTGTG GACTTGGTTT CCTGATAGTC CTTGCCCTTTTTCAGGCTGG GCTAGGAGAG 4080 ATGATGATGA AGTACAGAGA TCAGAGAGCT GGGAAGATCAGTGAAAGACT TGTGATTACC 4140 TCAGAAATGA TTGAAAATAT CCAATCTGTT AAGGCATACTGCTGGGAAGA AGCAATGGAA 4200 
AAAATGATTG AAAACTTAAG ACAAACAGAA CTGAAACTGACTCGGAAGGC AGCCTATGTG 4260 AGATACTTCA ATAGCTCAGC CTTCTTCTTC TCAGGGTTCTTTGTGGTGTT TTTATCTGTG 4320 CTTCCCTATG CACTAATCAA AGGAATCATC CTCCGGAAAATATTCACCAC CATCTCATTC 4380 TGCATTGTTC TGCGCATGGC GGTCACTCGG CAATTTCCCTGGGCTGTACA AACATGGTAT 4440 GACTCTCTTG GAGCAATAAA CAAAATACAG GATTTCTTACAAAAGCAAGA ATATAAGACA 4500 TTGGAATATA ACTTAACGAC TACAGAAGTA GTGATGGAGAATGTAACAGC CTTCTGGGAG 4560 GAGGGATTTG GGGAATTATT TGAGAAAGCA AAACAAAACAATAACAATAG AAAAACTTCT 4620 AATGGTGATG ACAGCCTCTT CTTCAGTAAT TTCTCACTTCTTGGTACTCC TGTCCTGAAA 4680 GATATTAATT TCAAGATAGA AAGAGGACAG TTGTTGGCGGTTGCTGGATC CACTGGAGCA 4740 GGCAAGACTT CACTTCTAAT GATGATTATG GGAGAACTGGAGCCTTCAGA GGGTAAAATT 4800 AAGCACAGTG GAAGAATTTC ATTCTGTTCT CAGTTTTCCTGGATTATGCC TGGCACCAAA 4860 AAAGAAAATA TCATCTTTGG TGTTTCCTAT GATGAATATAGATACAGAAG CGTCATCAAA 4920 GCATGCCAAC TAGAAGAGGA CATCTCCAAG TTTGCAGAGAAAGACAATAT AGTTCTTGGA 4980 GAAGGTGGAA TCACACTGAG TGGAGGTCAA CGAGCAAGAATTTCTTTAGC AAGAGCAGTA 5040 TACAAAGATG CTGATTTGTA TTTATTAGAC TCTCCTTTTGGATACCTAGA TGTTTTAACA 5100 GAAAAAGAAA TATTTGAAAG CTGTGTCTGT AAACTGATGGCTAACAAAAC TAGGATTTTG 5160 GTCACTTCTA AAATGGAACA TTTAAAGAAA GCTGACAAAATATTAATTTT GAATGAAGGT 5220 AGCAGCTATT TTTATGGGAC ATTTTCAGAA CTCCAAAATCTACAGCCAGA CTTTAGCTCA 5280 AAACTCATGG GATGTGATTC TTTCGACCAA TTTAGTGCAGAAAGAAGAAA TTCAATCCTA 5340 ACTGAGACCT TACACCGTTT CTCATTAGAA GGAGATGCTCCTGTCTCCTG GACAGAAACA 5400 AAAAAACAAT CTTTTAAACA GACTGGAGAG TTTGGGGAAAAAAGGAAGAA TTCTATTCTC 5460 AATCCAATCA ACTCTATACG AAAATTTTCC ATTGTGCAAAAGACTCCCTT ACAAATGAAT 5520 GGCATCGAAG AGGATTCTGA TGAGCCTTTA GAGAGAAGGCTGTCCTTAGT ACCAGATTCT 5580 GAGCAGGGAG AGGCGATACT GCCTCGCATC AGCGTGATCAGCACTGGCCC CACGCTTCAG 5640 GCACGAAGGA GGCAGTCTGT CCTGAACCTG ATGACACACTCAGTTAACCA AGGTCAGAAC 5700 ATTCACCGAA AGACAACAGC ATCCACACGA AAAGTGTCACTGGCCCCTCA GGCAAACTTG 5760 ACTGAACTGG ATATATATTC AAGAAGGTTA TCTCAAGAAACTGGCTTGGA AATAAGTGAA 5820 GAAATTAACG AAGAAGACTT AAAGGAGTGC CTTTTTGATGATATGGAGAG CATACCAGCA 5880 GTGACTACAT GGAACACATA CCTTCGATAT 
ATTACTGTCCACAAGAGCTT AATTTTTGTG 5940 CTAATTTGGT GCTTAGTAAT TTTTCTGGCA GAGGTGGCTGCTTCTTTGGT TGTGCTGTGG 6000 CTCCTTGGAA ACACTCCTCT TCAAGACAAA GGGAATAGTACTCATAGTAG AAATAACAGC 6060 TATGCAGTGA TTATCACCAG CACCAGTTCG TATTATGTGTTTTACATTTA CGTGGGAGTA 6120 GCCGACACTT TGCTTGCTAT GGGATTCTTC AGAGGTCTACCACTGGTGCA TACTCTAATC 6180 ACAGTGTCGA AAATTTTACA CCACAAAATG TTACATTCTGTTCTTCAAGC ACCTATGTCA 6240 ACCCTCAACA CGTTGAAAGC AGGTGGGATT CTTAATAGATTCTCCAAAGA TATAGCAATT 6300 TTGGATGACC TTCTGCCTCT TACCATATTT GACTTCATCCAGTTGTTATT AATTGTGATT 6360 GGAGCTATAG CAGTTGTCGC AGTTTTACAA CCCTACATCTTTGTTGCAAC AGTGCCAGTG 6420 ATAGTGGCTT TTATTATGTT GAGAGCATAT TTCCTCCAAACCTCACAGCA ACTCAAACAA 6480 CTGGAATCTG AAGGCAGGAG TCCAATTTTC ACTCATCTTGTTACAAGCTT AAAAGGACTG 6540 TGGACACTTC GTGCCTTCGG ACGGCAGCCT TACTTTGAAACTCTGTTCCA CAAAGCTCTG 6600 AATTTACATA CTGCCAACTG GTTCTTGTAC CTGTCAACACTGCGCTGGTT CCAAATGAGA 6660 ATAGAAATGA TTTTTGTCAT CTTCTTCATT GCTGTTACCTTCATTTCCAT TTTAACAACA 6720 GGAGAAGGAG AAGGAAGAGT TGGTATTATC CTGACTTTAGCCATGAATAT CATGAGTACA 6780 TTGCAGTGGG CTGTAAACTC CAGCATAGAT GTGGATAGCTTGATGCGATC TGTGAGCCGA 6840 GTCTTTAAGT TCATTGACAT GCCAACAGAA GGTAAACCTACCAAGTCAAC CAAACCATAC 6900 AAGAATGGCC AACTCTCGAA AGTTATGATT ATTGAGAATTCACACGTGAA GAAAGATGAC 6960 ATCTGGCCCT CAGGGGGCCA AATGACTGTC AAAGATCTCACAGCAAAATA CACAGAAGGT 7020 GGAAATGCCA TATTAGAGAA CATTTCCTTC TCAATAAGTCCTGGCCAGAG GGTGGGCCTC 7080 TTGGGAAGAA CTGGATCAGG GAAGAGTACT TTGTTATCAGCTTTTTTGAG ACTACTGAAC 7140 ACTGAAGGAG AAATCCAGAT CGATGGTGTG TCTTGGGATTCAATAACTTT GCAACAGTGG 7200 AGGAAAGCCT TTGGAGTGAT ACCACAGAAA GTATTTATTTTTTCTGGAAC ATTTAGAAAA 7260 AACTTGGATC CCTATGAACA GTGGAGTGAT CAAGAAATATGGAAAGTTGC AGATGAGGTT 7320 GGGCTCAGAT CTGTGATAGA ACAGTTTCCT GGGAAGCTTGACTTTGTCCT TGTGGATGGG 7380 GGCTGTGTCC TAAGCCATGG CCACAAGCAG TTGATGTGCTTGGCTAGATC TGTTCTCAGT 7440 AAGGCGAAGA TCTTGCTGCT TGATGAACCC AGTGCTCATTTGGATCCAGT AACATACCAA 7500 ATAATTAGAA GAACTCTAAA ACAAGCATTT GCTGATTGCACAGTAATTCT CTGTGAACAC 7560 AGGATAGAAG CAATGCTGGA ATGCCAACAA TTTTTGGTCATAGAAGAGAA CAAAGTGCGG 7620 
CAGTACGATT CCATCCAGAA ACTGCTGAAC GAGAGGAGCCTCTTCCGGCA AGCCATCAGC 7680 CCCTCCGACA GGGTGAAGCT CTTTCCCCAC CGGAACTCAAGCAAGTGCAA GTCTAAGCCC 7740 CAGATTGCTG CTCTGAAAGA GGAGACAGAA GAAGAGGTGCAAGATACAAG GCTTTAGAGA 7800 GCAGCATAAA TGTTGACATG GGACATTTGC TCATGGAATTGGAGCTCGTG GGACAGTCAC 7860 CTCATGGAAT TGGAGCTCGT GGAACAGTTA CCTCTGCCTCAGAAAACAAG GATGAATTAA 7920 GTTTTTTTTT AAAAAAGAAA CATTTGGGGA ATTCCTGCAGGAATTCGATA TCAAGCTTAT 7980 CGATATTGTT ACAACACCCC AACATCTTCG ACGCGGGCGTGGCAGGTCTT CCCGACGATG 8040 ACGCCGGTGA ACTTCCCGCC GCCGTTGTTG TTTTGGAGCACGGAAAGACG ATGACGGAAA 8100 AAGAGATCGT GGATTACGTC GCCAGTCAAG TAACAACCGCGAAAAAGTTG CGCGGAGGAG 8160 TTGTGTTTGT GGACGAAGTA CCGAAAGGTC TTACCGGAAAACTCGACGCA AGAAAAATCA 8220 GAGAGATCCT CATAAAGGCC AAGAAGGGCG GAAAGTCCAAATTGTAAAAT GTAACTGTAT 8280 TCAGCGATGA CGAAATTCTT AGCTATTGTA ATACTGCGATGAGTGGCAGG GCGGGGCGTA 8340 ATTTTTTTAA GGCAGTTATT GGTGCCCTTA AACGCCTGGTGCTACGCCTG AATAAGTGAT 8400 AATAAGCGGA TGAATGGCAG AAATTCGCCG GATCTTTGTGAAGGAACCTT ACTTCTGTGG 8460 TGTGACATAA TTGGACAAAC TACCTACAGA GATTTAAAGCTCTAAGGTAA ATATAAAATT 8520 TTTAAGTGTA TAATGTGTTA AACTACTGAT TCTAATTGTTTGTGTATTTT AGATTCCAAC 8580 CTATGGAACT GATGAATGGG AGCAGTGGTG GAATGCCTTTAATGAGGAAA ACCTGTTTTG 8640 CTCAGAAGAA ATGCCATCTA GTGATGATGA GGCTACTGCTGACTCTCAAC ATTCTACTCC 8700 TCCAAAAAAG AAGAGAAAGG TAGAAGACCC CAAGGACTTTCCTTCAGAAT TGCTAAGTTT 8760 TTTGAGTCAT GCTGTGTTTA GTAATAGAAC TCTTGCTTGCTTTGCTATTT ACACCACAAA 8820 GGAAAAAGCT GCACTGCTAT ACAAGAAAAT TATGGAAAAATATTCTGTAA CCTTTATAAG 8880 TAGGCATAAC AGTTATAATC ATAACATACT GTTTTTTCTTACTCCACACA GGCATAGAGT 8940 GTCTGCTATT AATAACTATG CTCAAAAATT GTGTACCTTTAGCTTTTTAA TTTGTAAAGG 9000 GGTTAATAAG GAATATTTGA TGTATAGTGC CTTGACTAGAGATCATAATC AGCCATACCA 9060 CATTTGTAGA GGTTTTACTT GCTTTAAAAA ACCTCCCACACCTCCCCCTG AACCTGAAAC 9120 ATAAAATGAA TGCAATTGTT GTTGTTAACT TGTTTATTGCAGCTTATAAT GGTTACAAAT 9180 AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTTTTCACTGCAT TCTAGTTGTG 9240 GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTGGATCCGTCGACCG ATGCCCTTGA 9300 GAGCCTTCAA CCCAGTCAGC TCCTTCCGGT 
GGGCGCGGGGCATGACTATC GTCGCCGCAC 9360 TTATGACTGT CTTCTTTATC ATGCAACTCG TAGGACAGGTGCCGGCAGCG CTCTTCCGCT 9420 TCCTCGCTCA CTGACTCGCT GCGCTCGGTC GTTCGGCTGCGGCGAGCGGT ATCAGCTCAC 9480 TCAAAGGCGG TAATACGGTT ATCCACAGAA TCAGGGGATAACGCAGGAAA GAACATGTGA 9540 GCAAAAGGCC AGCAAAAGGC CAGGAACCGT AAAAAGGCCGCGTTGCTGGC GTTTTTCCAT 9600 AGGCTCCGCC CCCCTGACGA GCATCACAAA AATCGACGCTCAAGTCAGAG GTGGCGAAAC 9660 CCGACAGGAC TATAAAGATA CCAGGCGTTT CCCCCTGGAAGCTCCCTCGT GCGCTCTCCT 9720 GTTCCGACCC TGCCGCTTAC CGGATACCTG TCCGCCTTTCTCCCTTCGGG AAGCGTGGCG 9780 CTTTCTCATA GCTCACGCTG TAGGTATCTC AGTTCGGTGTAGGTCGTTCG CTCCAAGCTG 9840 GGCTGTGTGC ACGAACCCCC CGTTCAGCCC GACCGCTGCGCCTTATCCGG TAACTATCGT 9900 CTTGAGTCCA ACCCGGTAAG ACACGACTTA TCGCCACTGGCAGCAGCCAC TGGTAACAGG 9960 ATTAGCAGAG CGAGGTATGT AGGCGGTGCT ACAGAGTTCTTGAAGTGGTG GCCTAACTAC 10020 GGCTACACTA GAAGGACAGT ATTTGGTATC TGCGCTCTGCTGAAGCCAGT TACCTTCGGA 10080 AAAAGAGTTG GTAGCTCTTG ATCCGGCAAA CAAACCACCGCTGGTAGCGG TGGTTTTTTT 10140 GTTTGCAAGC AGCAGATTAC GCGCAGAAAA AAAGGATCTCAAGAAGATCC TTTGATCTTT 10200 TCTACGGGGT CTGACGCTCA GTGGAACGAA AACTCACGTTAAGGGATTTT GGTCATGAGA 10260 TTATCAAAAA GGATCTTCAC CTAGATCCTT TTAAATTAAAAATGAAGTTT TAAATCAATC 10320 TAAAGTATAT ATGAGTAAAC TTGGTCTGAC AGTTACCAATGCTTAATCAG TGAGGCACCT 10380 ATCTCAGCGA TCTGTCTATT TCGTTCATCC ATAGTTGCCTGACTCCCCGT CGTGTAGATA 10440 ACTACGATAC GGGAGGGCTT ACCATCTGGC CCCAGTGCTGCAATGATACC GCGAGACCAC 10500 CGCTCACCGG CTCCAGATTT ATCAGCAATA AACCAGCCAGCCGGAAGGGC CGAGCGCAGA 10560 AGTGGTCCTG CAACTTTATC CGCCTCCATC CAGTCTATTAATTGTTGCCG GGAAGCTAGA 10620 GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC AACGTTGTTGCCATTGCTAC AGGCATCGTG 10680 GTGTCACGCT CGTCGTTTGG TATGGCTTCA TTCAGCTCCGGTTCCCAACG ATCAAGGCGA 10740 GTTACATGAT CCCCCATGTT GTGCAAAAAA GCGGTTAGCTCCTTCGGTCC TCCGATCGTT 10800 GTCAGAAGTA AGTTGGCCGC AGTGTTATCA CTCATGGTTATGGCAGCACT GCATAATTCT 10860 CTTACTGTCA TGCCATCCGT AAGATGCTTT TCTGTGACTGGTGAGTACTC AACCAAGTCA 10920 TTCTGAGAAT AGTGTATGCG GCGACCGAGT TGCTCTTGCCCGGCGTCAAT ACGGGATAAT 10980 ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG CTCATCATTGGAAAACGTTC 
TTCGGGGCGA 11040 AAACTCTCAA GGATCTTACC GCTGTTGAGA TCCAGTTCGATGTAACCCAC TCGTGCACCC 11100 AACTGATCTT CAGCATCTTT TACTTTCACC AGCGTTTCTGGGTGAGCAAA AACAGGAAGG 11160 CAAAATGCCG CAAAAAAGGG AATAAGGGCG ACACGGAAATGTTGAATACT CATACTCTTC 11220 CTTTTTCAAT ATTATTGAAG CATTTATCAG GGTTATTGTCTCATGAGCGG ATACATATTT 11280 GAATGTATTT AGAAAAATAA ACAAATAGGG GTTCCGCGCACATTTCCCCG AAAAGTGCCA 11340 CCTGACGCGC CCTGTAGCGG CGCATTAAGC GCGGCGGGTGTGGTGGTTAC GCGCAGCGTG 11400 ACCGCTACAC TTGCCAGCGC CCTAGCGCCC GCTCCTTTCGCTTTCTTCCC TTCCTTTCTC 11460 GCCACGTTCG CCGGCTTTCC CCGTCAAGCT CTAAATCGGGGGCTCCCTTT AGGGTTCCGA 11520 TTTAGTGCTT TACGGCACCT CGACCCCAAA AAACTTGATTAGGGTGATGG TTCACGTAGT 11580 GGGCCATCGC CCTGATAGAC GGTTTTTCGC CCTTTGACGTTGGAGTCCAC GTTCTTTAAT 11640 AGTGGACTCT TGTTCCAAAC TGGAACAACA CTCAACCCTATCTCGGTCTA TTCTTTTGAT 11700 TTATAAGGGA TTTTGCCGAT TTCGGCCTAT TGGTTAAAAAATGAGCTGAT TTAACAAAAA 11760 TTTAACGCGA ATTTTAACAA AATATTAACG TTTACAATTTCCCATTCGCC ATTCAGGCTA 11820 CGCAACTGTT GGGAAGGGCG ATCGGTGCGG GCCTCTTCGCTATTACGCCA GCCCAAGCTA 11880 CCATGATAAG TAAGTAATAT TAAGGTACGT GGAGGTTTTACTTGCTTTAA AAAACCTCCC 11940 ACACCTCCCC CTGAACCTGA AACATAAAAT GAATGCAATTGTTGTTGTTA ACTTGTTTAT 12000 TGCAGCTTAT AATGGTTACA AATAAAGCAA TAGCATCACAAATTTCACAA ATAAAGCATT 12060 TTTTTCACTG CATTCTAGTT GTGGTTTGTC CAAACTCATCAATGTATCTT ATGGTACTGT 12120 AACTGAGCTA ACATAACCCG GGA 12143 24 basepairs nucleic acid single linear - 1..24 /note= ”K18P3 synthetic DNAoligo-nucleotide - amplification primer for obtaining K18“ 2 GCAACGCGTCAGGTAAGGGG TAGG 24 26 base pairs nucleic acid single linear - 1..26/note= ”K18P4 synthetic DNA oligo-nucleotide - amplification primer forobtaining K18“ 3 CGAAGATCTG GAGGGATTGT AGAGAG 26 23 base pairs nucleicacid single linear - 1..23 /note= ”K18XH5′ synthetic DNAoligonucleotide - amplification primer for obtaining K18 “ 4 CATAATAACGTCATTTCCTG CCC 23 28 base pairs nucleic acid single linear - 1..28/note= ”K18P2* synthetic DNA oligo-nucleotide - amplification primer forobtaining K18“ 5 GCTACGCGTG AGAGAAAGGA CAGGACTC 28 21 
base pairs nucleic acid single linear - 1..21 /note= ”K18NsiI synthetic DNA oligo-nucleotide - amplification primer for obtaining K18“ 6 CTCACAGTAG GTGCTGAATG C 21
23 base pairs nucleic acid single linear - 1..23 /note= ”K18XH3′ synthetic DNA oligo-nucleotide - amplification primer for obtaining K18“ 7 GACACGGACA GCAGGTGTTG TTG 23
30 base pairs nucleic acid single linear - 1..30 /note= ”K18P1 synthetic oligo-nucleotide - amplification primer for obtaining K18“ 8 CGAGGTACCA ATAACAGTAA AAGGCAGTAC 30
23 base pairs nucleic acid single linear - 1..23 /note= ”K18NsiIR synthetic DNA oligo-nucleotide - amplification primer for obtaining K18“ 9 CACCGGTATA TCACCTTTCC TGC 23
18 base pairs nucleic acid single linear - 1..18 /note= ”cftrp1 synthetic DNA oligonucleotide - amplification primer for PCR mutagenesis“ 10 GAGACCATGG AGAGGTCG 18
34 base pairs nucleic acid single linear other nucleic acid - 1..34 /note= ”TE“ 11 GTTTTTATTT TTAATTTTCT TTCAAATACT TCCA 34
24 base pairs nucleic acid single linear - 1..24 /note= ”TE2 synthetic DNA oligo-nucleotide - amplification primer for PCR mutagenesis“ 12 GTCCGCAAAG CCTGAGTCCT GTCC 24
54 base pairs nucleic acid single linear - 1..54 /note= ”K183′SS synthetic DNA oligo-nucleotide - amplification primer for PCR mutagenesis“ 13 AAATTAAAAA TAAAAACAGA CCTGAAAAAA AAAAAGAGAG AGGTTGTTCC ATGA 54
49 base pairs nucleic acid single /note= ”TEtop synthetic DNA oligo-nucleotide - amplification primer for PCR mutagenesis“ - 1..49 14 GATCTGTTTT TATTTTTAAT TTTCTTTCAA ATACTTCCAC CATGGCCCC 49
25 base pairs nucleic acid single linear - 1..25 /note= ”cftr3′SS synthetic DNA oligo-nucleotide - amplification primer for PCR mutagenesis“ 15 GGTGACTTCC CCCAAATATA AAAAG 25
23 base pairs nucleic acid single linear - 1..23 /note= ”TE1 synthetic DNA oligo-nucleotide - amplification primer for PCR analysis of CFTR mRNA“ 16 CTGTCCTTTC TCTCACGCGT CAG 23
16 base pairs nucleic acid single linear - 1..16 /note= ”cftrp2 synthetic DNA oligo-nucleotide - amplification primer for PCR analysis of CFTR mRNA“
17 GAGGAGTGCC ACTTGC 16 24 base pairs nucleicacid single linear - 1..24 /note= ”cftrp3 synthetic DNAoligo-nucleotide - amplification primer for PCR analysis of CFTR mRNA“18 GTTGTTGGAA AGGAGACTAA CAAG 24 10504 base pairs nucleic acid doublecircular - /note= ”K18EpilongmTELacZ; Figure 20“ 19 GGTACCAATAACAGTAAAAG GCAGTACATA GCTTGTTGAC TCCACATACT TTATTATAAA 60 ATACTGCCCAACTTGACAGT TCTGGAATCC AGTGGGGGAA TATAAAGGTG AAAGCAGGAG 120 AGACCCCTCTGACTGGAACC TCTTACCTCC CAGAAGCCTT GTATGCAAAA CCAGTGGGCA 180 TTCATTTGTATGTTATTTTG CATCCCGTTT GCCTCCCAGC CTTCAGCAGG CCCCGACCCT 240 CCCCTGGCCAGCTTCCACCC TGACTGCCCC CTGGCTGGCT CCCATTGAGC ACTGTGGGCT 300 CTCCCCACCATTAGGTGACA GATCAGGAAC AATCCAGGCT CAGGCTCTTT ATCTGTGCTC 360 TGCCTCCCACCTGGCAGGTC CACTGGCCAG GCTTTTCCAG GGTCCCTTCT CTCCCAGGTC 420 TGCCCTACTATTTGTCCTCC CCTTCCCCCT CAGCTGGTAG CTCGATAAGA ATCAATAGGT 480 CCACTCCAGAGCAAAGAACA CAGCCAAATG TGTCATACCA GGCCCTGCCA GAAAAACGAG 540 CTGCTGGAGCTGACAAACTT GAAGGCCAAA CACCTAAGGT TCCCCCCAAC ACTTCATTCA 600 GCAGGGATGGTCATTCAGCT TCAGGGGGCA GGCAGCATGA AAGCCTCCCT ACCTCCATCC 660 TTCTCACACAGAGGCTGGGG AGAGCATCTT GGAGGATGCA GTCCCCTGGG GCCAGGCTTC 720 TAATCCAGACAGCCCTTACA AGGGGGGACA GGGGAAGGAC TGGCTTGGAG AAAAGTCCTA 780 GAAAAGAGGGGAGGGGCACT GGCCACCAGG GCTGGGTCGC TGCTATGATG GTCCTAGGAG 840 TGCCTGCCTGTCCTCTCAGG CCCCATGCGA TGTAGGACAC ATTACTTTTA TTTATTTATT 900 TATTTATTTTGAGTCAGAGT TTCGCTCTGG TTGCCCAGGC TGGAGCGCGA CGGCACGATC 960 TTGGCTCACTGCAACCTCTG CCTCCTGGGT TCAAGCGATT CTCCTGCCTC AGCCTCCTGA 1020 GTAGCTGGGATTACAGGCAC ACACTGTGCT GGTTAATTTT TGTATTTTTA GTAGAGAAGG 1080 GGTGTCACCATGTTGGTCAG GCTGGTCTCA AATTTTTTTT TTTTTTTTTT TTTTTTTTTG 1140 AGACAGAGTCTTGCTCTGTT GTCTAGGCTG GAGTGCAGTG GCATCGAACT CTTGACCTCA 1200 AGTGATCCACCCGCCTCGGC CTCCCAAAGT GCTTGGATTA CAGGCATGAG CCACTGTGCC 1260 CGGCGATGTGGGACACATTA TCATCTCTGT GAGAGATTTT TGGTCTCTTT TGTCACCGCC 1320 CTTCTCTCCCAGCTCCTAGA ACTGGGCCTG GCTCACAGTA GGTGCTGAAT GCATACTGGT 1380 TGAATTGTAAATGCTCAGGA TTTGTTTAAT TAAGGATGCA GGAAAGGTGA TATACCGGTG 1440 TGCAGAAGTCAGGATGCATT CCCTGTCCAA ATCACAGTGT 
TCCACTGAGG CAAGGCCCTT 1500 GGGAGTGAGGTCGGGAGAGG GGAGGGTGGT GGAGGGGGCT CAGAGACTGG GTTTGTTTTG 1560 GGGAGTCTGCACCTATTTGC TGAGTGAATG TATGTGTGTG TGCATTTGAG AGCACACCTC 1620 TGTATGATTCGGGTGTGAGT GTGTGTGAGG AAACGTGGGC AGGCGAGGAG TGTTTGGGAG 1680 CCAGGTGCAGCTGGGGTGTG AGTGTGTAAG CAAGCAGCTA TGAGGCTGGG CATTGCTTCT 1740 CCTCCTCTTCTCCAGCTCCC AGCCTTTCTT CCCCGGGACT CCTGGGGCTC CAGGATGCCC 1800 CCAAGATCCCCTCCACAAGT GGATAATTTG GGCTGCAGGT TAAGGACAGC TAGAGGGACT 1860 CACAGGCCATTCCACCCGCA CACCACCAGA CCCCCAAATT TCTTTTTTCT TTTTTTTTTG 1920 AGACAGAGTCTCACTCTGTC GCCAGGCTGC AGTGGCGCGA TCTCGGCTCA CTGCAACCTC 1980 CGCCTCCCAGGTTCAAGCGA TTCCCCTTCC TCAGCCTCCC AAGTAGCTGA GACTACAGGC 2040 GTGCACCATCACGTCCGGCT AATTTTTTGT ATTTTAGTAG AGAGGGGTTT CACCATGTTG 2100 GCTAGGATGGTCTCGATCTC CTGACCTCGT GATCCGCCCA CCTAGGCCTC CCAAAGTGCT 2160 GAGATTACAGGCGTGAGCCA CTGCGCCCGG TCAAGACTCC CAAATTTCAA ACTCGCCAGC 2220 ACCTCCTCCACCTGGGGGAG AAGAGCATAA TAACGTCATT TCCTGCCCTG AAAGCAGCCT 2280 CGAGGGCCAACAACACCTGC TGTCCGTGTC CATGCCCGGT TGGCCACCCC GTTTCTGGGG 2340 GGTGAGCGGGGCTTGGCAGG GCTGCGCGGA GGGCGCGGGG GTGGGGCCCG GGGCGGAGCG 2400 GCCCGGGGCGGAGGGCGCGG GCTCCGAGCC GTCCACCTGT GGCTCCGGCT TCCGAAGCGG 2460 CTCCGGGGCGGGGGCGGGGC CTCACTCTGC GATATAACTC GGGTCGCGCG GCTCGCGCAG 2520 GCCGCCACCGTCGTCCGCAA AGCCTGAGTC CTGTCCTTTC TCTCACGCGT CAGGTAAGGG 2580 GTAGGAGGGACCTCAACTCC CAGCCTTGTC TGACCCTCCA ATTATACACT CCTTTGCCTC 2640 TTTCCGTCATTCCATAACCA CCCCAACCCC TACTCCACCG GGAGGGGGTT GGGCATACCT 2700 GGATTTCCATCCGCGCACCT AGCCACAGGG TCCCTAAGAG CAGCAGCAGC TAGGCATGGG 2760 AGGGCTCTTTCCCAGGAGAG AGGGGGAAGG GGACAGGGTT GAGAGCTTTA CAGAGGAAGT 2820 GGACAGCATGGAGGGAGGTA AGGAAAGGCC TGTAAAGAGG AGGAGACACT GGCTCTGGCG 2880 GAATGGGGACTATTGGAGGG TTAAGCGGAT GTGGCTAAGG CTGAGTCATC TAGGAGTAAA 2940 CAAGAGGCCTTCCTTTGGGA GGAGCCAATC CAGGGTGTAG GGGGCCCAGA GTGACCAGGT 3000 GCACTAGGGAAAAAATGCCA GGAGAGGGCC AGGAAGAGGA CTTGTTAGTA GCGACTCACT 3060 TCTGGGCAGGCAGGCCAGCC AGCTAGCCAG CCTGCTGAGG CTTCCCAAGA GGGGCAGAGT 3120 GCTGGGATCTGGGAATCCAG GAAAGGAGGG AATGGGGTGG GGCTAGATGA AAAGGGATAG 3180 
GTGTCCAGGGAGAGCCTCTG GCTATTCCTG GGACCAGGAA GTTTTCACTA GGATACATAA 3240 CACTTTTTACACACTCACCC CACCCATCCC TGGCTTTCTA TTCATGGAAC AACCTCTCTC 3300 TTTTTTTTTTTCAGGTCTGT TTTTATTTTT AATTTTCTTT CAAATACTTC CACCATGGCC 3360 AAGATCCCTCCTAAGAAGAA GCGCAAAGTC GAGGATCCCG TCGTTTTACA ACGTCGTGAC 3420 TGGGAAAACCCTGGCGTTAC CCAACTTAAT CGCCTTGCAG CACATCCCCC TTTCGCCAGC 3480 TGGCGTAATAGCGAAGAGGC CCGCACCGAT CGCCCTTCCC AACAGTTGCG CAGCCTGAAT 3540 GGCGAATGGCGCTTTGCCTG GTTTCCGGCA CCAGAAGCGG TGCCGGAAAG CTGGCTGGAG 3600 TGCGATCTTCCTGAGGCCGA TACTGTCGTC GTCCCCTCAA ACTGGCAGAT GCACGGTTAC 3660 GATGCGCCCATCTACACCAA CGTAACCTAT CCCATTACGG TCAATCCGCC GTTTGTTCCC 3720 ACGGAGAATCCGACGGGTTG TTACTCGCTC ACATTTAATG TTGATGAAAG CTGGCTACAG 3780 GAAGGCCAGACGCGAATTAT TTTTGATGGC GTTAACTCGG CGTTTCATCT GTGGTGCAAC 3840 GGGCGCTGGGTCGGTTACGG CCAGGACAGT CGTTTGCCGT CTGAATTTGA CCTGAGCGCA 3900 TTTTTACGCGCCGGAGAAAA CCGCCTCGCG GTGATGGTGC TGCGTTGGAG TGACGGCAGT 3960 TATCTGGAAGATCAGGATAT GTGGCGGATG AGCGGCATTT TCCGTGACGT CTCGTTGCTG 4020 CATAAACCGACTACACAAAT CAGCGATTTC CATGTTGCCA CTCGCTTTAA TGATGATTTC 4080 AGCCGCGCTGTACTGGAGGC TGAAGTTCAG ATGTGCGGCG AGTTGCGTGA CTACCTACGG 4140 GTAACAGTTTCTTTATGGCA GGGTGAAACG CAGGTCGCCA GCGGCACCGC GCCTTTCGGC 4200 GGTGAAATTATCGATGAGCG TGGTGGTTAT GCCGATCGCG TCACACTACG TCTGAACGTC 4260 GAAAACCCGAAACTGTGGAG CGCCGAAATC CCGAATCTCT ATCGTGCGGT GGTTGAACTG 4320 CACACCGCCGACGGCACGCT GATTGAAGCA GAAGCCTGCG ATGTCGGTTT CCGCGAGGTG 4380 CGGATTGAAAATGGTCTGCT GCTGCTGAAC GGCAAGCCGT TGCTGATTCG AGGCGTTAAC 4440 CGTCACGAGCATCATCCTCT GCATGGTCAG GTCATGGATG AGCAGACGAT GGTGCAGGAT 4500 ATCCTGCTGATGAAGCAGAA CAACTTTAAC GCCGTGCGCT GTTCGCATTA TCCGAACCAT 4560 CCGCTGTGGTACACGCTGTG CGACCGCTAC GGCCTGTATG TGGTGGATGA AGCCAATATT 4620 GAAACCCACGGCATGGTGCC AATGAATCGT CTGACCGATG ATCCGCGCTG GCTACCGGCG 4680 ATGAGCGAACGCGTAACGCG AATGGTGCAG CGCGATCGTA ATCACCCGAG TGTGATCATC 4740 TGGTCGCTGGGGAATGAATC AGGCCACGGC GCTAATCACG ACGCGCTGTA TCGCTGGATC 4800 AAATCTGTCGATCCTTCCCG CCCGGTGCAG TATGAAGGCG GCGGAGCCGA CACCACGGCC 4860 ACCGATATTATTTGCCCGAT GTACGCGCGC 
GTGGATGAAG ACCAGCCCTT CCCGGCTGTG 4920 CCGAAATGGTCCATCAAAAA ATGGCTTTCG CTACCTGGAG AGACGCGCCC GCTGATCCTT 4980 TGCGAATACGCCCACGCGAT GGGTAACAGT CTTGGCGGTT TCGCTAAATA CTGGCAGGCG 5040 TTTCGTCAGTATCCCCGTTT ACAGGGCGGC TTCGTCTGGG ACTGGGTGGA TCAGTCGCTG 5100 ATTAAATATGATGAAAACGG CAACCCGTGG TCGGCTTACG GCGGTGATTT TGGCGATACG 5160 CCGAACGATCGCCAGTTCTG TATGAACGGT CTGGTCTTTG CCGACCGCAC GCCGCATCCA 5220 GCGCTGACGGAAGCAAAACA CCAGCAGCAG TTTTTCCAGT TCCGTTTATC CGGGCAAACC 5280 ATCGAAGTGACCAGCGAATA CCTGTTCCGT CATAGCGATA ACGAGCTCCT GCACTGGATG 5340 GTGGCGCTGGATGGTAAGCC GCTGGCAAGC GGTGAAGTGC CTCTGGATGT CGCTCCACAA 5400 GGTAAACAGTTGATTGAACT GCCTGAACTA CCGCAGCCGG AGAGCGCCGG GCAACTCTGG 5460 CTCACAGTACGCGTAGTGCA ACCGAACGCG ACCGCATGGT CAGAAGCCGG GCACATCAGC 5520 GCCTGGCAGCAGTGGCGTCT GGCGGAAAAC CTCAGTGTGA CGCTCCCCGC CGCGTCCCAG 5580 GCCATCCCGCATCTGACCAC CAGCGAAATG GATTTTTGCA TCGAGCTGGG TAATAAGCGT 5640 TGGCAATTTAACCGCCAGTC AGGCTTTCTT TCACAGATGT GGATTGGCGA TAAAAAACAA 5700 CTGCTGACGCCGCTGCGCGA TCAGTTCACC CGTGCACCGC TGGATAACGA CATTGGCGTA 5760 AGTGAAGCGACCCGCATTGA CCCTAACGCC TGGGTCGAAC GCTGGAAGGC GGCGGGCCAT 5820 TACCAGGCCGAAGCAGCGTT GTTGCAGTGC ACGGCAGATA CACTTGCTGA TGCGGTGCTG 5880 ATTACGACCGCTCACGCGTG GCAGCATCAG GGGAAAACCT TATTTATCAG CCGGAAAACC 5940 TACCGGATTGATGGTAGTGG TCAAATGGCG ATTACCGTTG ATGTTGAAGT GGCGAGCGAT 6000 ACACCGCATCCGGCGCGGAT TGGCCTGAAC TGCCAGCTGG CGCAGGTAGC AGAGCGGGTA 6060 AACTGGCTCGGATTAGGGCC GCAAGAAAAC TATCCCGACC GCCTTACTGC CGCCTGTTTT 6120 GACCGCTGGGATCTGCCATT GTCAGACATG TATACCCCGT ACGTCTTCCC GAGCGAAAAC 6180 GGTCTGCGCTGCGGGACGCG CGAATTGAAT TATGGCCCAC ACCAGTGGCG CGGCGACTTC 6240 CAGTTCAACATCAGCCGCTA CAGTCAACAG CAACTGATGG AAACCAGCCA TCGCCATCTG 6300 CTGCACGCGGAAGAAGGCAC ATGGCTGAAT ATCGACGGTT TCCATATGGG GATTGGTGGC 6360 GACGACTCCTGGAGCCCGTC AGTATCGGCG GAATTCCAGC TGAGCGCCGG TCGCTACCAT 6420 TACCAGTTGGTCTGGTGTCA AAAATATCTT TGTGAAGGAA CCTTACTTCT GTGGTGTGAC 6480 ATAATTGGACAAACTACCTA CAGAGATTTA AAGCTCTAAG GTAAATATAA AATTTTTAAG 6540 TGTATAATGTGTTAAACTAC TGATTCTAAT TGTTTGTGTA TTTTAGATTC CAACCTATGG 6600 
AACTGATGAATGGGAGCAGT GGTGGAATGC CTTTAATGAG GAAAACCTGT TTTGCTCAGA 6660 AGAAATGCGATCTAGTGATG ATGAGGCTAC TGCTGACTCT CAACATTCTA CTCCTCCAAA 6720 AAAGAAGAGAAAGGTAGAAG ACCCCAAGGA CTTTCCTTCA GAATTGCTAA GTTTTTTGAG 6780 TCATGCTGTGTTTAGTAATA GAACTCTTGC TTGCTTTGCT ATTTACACCA CAAAGGAAAA 6840 AGCTGCACTGCTATACAAGA AAATTATGGA AAAATATTCT GTAACCTTTA TAAGTAGGCA 6900 TAACAGTTATAATCATAACA TACTGTTTTT TCTTACTCCA CACAGGCATA GAGTGTCTGC 6960 TATTAATAACTATGCTCAAA AATTGTGTAC CTTTAGCTTT TTAATTTGTA AAGGGGTTAA 7020 TAAGGAATATTTGATGTATA GTGCCTTGAC TAGAGATCAT AATCAGCCAT ACCACATTTG 7080 TAGAGGTTTTACTTGCTTTA AAAAACCTCC CACACCTCCC CCTGAACCTG AAACATAAAA 7140 TGAATGCAATTGTTGTTGTT AACTTGTTTA TTGCAGCTTA TAATGGTTAC AAATAAAGCA 7200 ATAGCATCACAAATTTCACA AATAAAGCAT TTTTTTCACT GCATTCTAGT TGTGGTTTGT 7260 CCAAACTCATCAATGTATCT TATCATGTCT GGATCCCCAG GAAGCTCCTC TGTGTCCTCA 7320 TAAACCCTAACCTCCTCTAC TTGAGAGGAC ATTCCAATCA TAGGCTGCCC ATCCACCCTC 7380 TGTGTCCTCCTGTTAATTAG GTCACTTAAC AAAAAGGAAA TTGGGTAGGG GTTTTTCACA 7440 GACCGCTTTCTAAGGGTAAT TTTAAAATAT CTGGGAAGTC CCTTCCACTG CTGTGTTCCA 7500 GAAGTGTTGGTAAACAGCCC ACAAATGTCA ACAGCAGAAA CATACAAGCT GTCAGCTTTG 7560 CACAAGGGCCCGGTACCCGG GGATCCTCTA GAACTAGTGG ATCCCCCGGG CTGCAGGAAT 7620 TCGATATCAAGCTTATCGAT ACCGTCGACC GATGCCCTTG AGAGCCTTCA ACCCAGTCAG 7680 CTCCTTCCGGTGGGCGCGGG GCATGACTAT CGTCGCCGCA CTTATGACTG TCTTCTTTAT 7740 CATGCAACTCGTAGGACAGG TGCCGGCAGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC 7800 TGCGCTCGGTCGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT 7860 TATCCACAGAATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG 7920 CCAGGAACCGTAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG 7980 AGCATCACAAAAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT 8040 ACCAGGCGTTTCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA 8100 CCGGATACCTGTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAT AGCTCACGCT 8160 GTAGGTATCTCAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC 8220 CCGTTCAGCCCGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA 8280 GACACGACTTATCGCCACTG GCAGCAGCCA 
CTGGTAACAG GATTAGCAGA GCGAGGTATG 8340 TAGGCGGTGCTACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG 8400 TATTTGGTATCTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT 8460 GATCCGGCAAACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA 8520 CGCGCAGAAAAAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC 8580 AGTGGAACGAAAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA 8640 CCTAGATCCTTTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA 8700 CTTGGTCTGACAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT 8760 TTCGTTCATCCATAGTTGCC TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT 8820 TACCATCTGGCCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT 8880 TATCAGCAATAAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT 8940 CCGCCTCCATCCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT TCGCCAGTTA 9000 ATAGTTTGCGCAACGTTGTT GCCATTGCTA CAGGCATCGT GGTGTCACGC TCGTCGTTTG 9060 GTATGGCTTCATTCAGCTCC GGTTCCCAAC GATCAAGGCG AGTTACATGA TCCCCCATGT 9120 TGTGCAAAAAAGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG 9180 CAGTGTTATCACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC ATGCCATCCG 9240 TAAGATGCTTTTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA TAGTGTATGC 9300 GGCGACCGAGTTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA CATAGCAGAA 9360 CTTTAAAAGTGCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA AGGATCTTAC 9420 CGCTGTTGAGATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT 9480 TTACTTTCACCAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC GCAAAAAAGG 9540 GAATAAGGGCGACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA TATTATTGAA 9600 GCATTTATCAGGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT TAGAAAAATA 9660 AACAAATAGGGGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGCG CCCTGTAGCG 9720 GCGCATTAAGCGCGGCGGGT GTGGTGGTTA CGCGCAGCGT GACCGCTACA CTTGCCAGCG 9780 CCCTAGCGCCCGCTCCTTTC GCTTTCTTCC CTTCCTTTCT CGCCACGTTC GCCGGCTTTC 9840 CCCGTCAAGCTCTAAATCGG GGGCTCCCTT TAGGGTTCCG ATTTAGTGCT TTACGGCACC 9900 TCGACCCCAAAAAACTTGAT TAGGGTGATG GTTCACGTAG TGGGCCATCG CCCTGATAGA 9960 CGGTTTTTCGCCCTTTGACG TTGGAGTCCA CGTTCTTTAA TAGTGGACTC TTGTTCCAAA 10020 
CTGGAACAACACTCAACCCT ATCTCGGTCT ATTCTTTTGA TTTATAAGGG ATTTTGCCGA 10080 TTTCGGCCTATTGGTTAAAA AATGAGCTGA TTTAACAAAA ATTTAACGCG AATTTTAACA 10140 AAATATTAACGTTTACAATT TCCCATTCGC CATTCAGGCT ACGCAACTGT TGGGAAGGGC 10200 GATCGGTGCGGGCCTCTTCG CTATTACGCC AGCCCAAGCT ACCATGATAA GTAAGTAATA 10260 TTAAGGTACGTGGAGGTTTT ACTTGCTTTA AAAAACCTCC CACACCTCCC CCTGAACCTG 10320 AAACATAAAATGAATGCAAT TGTTGTTGTT AACTTGTTTA TTGCAGCTTA TAATGGTTAC 10380 AAATAAAGCAATAGCATCAC AAATTTCACA AATAAAGCAT TTTTTTCACT GCATTCTAGT 10440 TGTGGTTTGTCCAAACTCAT CAATGTATCT TATGGTACTG TAACTGAGCT AACATAACCC 10500 GGGA 10504

We claim:
 1. An expression cassette for the episomal expression of a transgene in a targeted mammalian epithelial cell, comprising: a cytokeratin 18 5′ region, a cytokeratin 18 promoter, and a cytokeratin 18 intron 1, wherein the expression cassette is capable of receiving a transgene for expression in the epithelial cell, with the proviso that the transgene is not cytokeratin 18.
 2. The expression cassette of claim 1, wherein the cytokeratin 18 intron 1 comprises all or part of nucleotide nos. 2566 to 3315 of SEQ ID NO:1.
 3. The expression cassette of claim 1, wherein the cytokeratin 18 5′ region and cytokeratin 18 promoter comprise all or part of nucleotide nos. 1 to 2565 of SEQ ID NO:1.
 4. The expression cassette of claim 1, further comprising a translational enhancer.
 5. The expression cassette of claim 4, wherein the translational enhancer comprises all or part of nucleotide nos. 3316 to 3354 of SEQ ID NO:1.
 6. The expression cassette of claim 1, comprising all or part of nucleotide nos. 1 to 3354 and 7949 to 12143 of SEQ ID NO:1.
 7. The expression cassette of claims 1, 2, 3, 4, 5 or 6, further comprising a transgene selected from the group consisting of (i) a nucleotide sequence comprising all or part of nucleotide nos. 3355 to 7955 of SEQ ID NO:1 (ii) a nucleotide sequence having at least 80% sequence identity with nucleotide nos. 3355 to 7955 of SEQ ID NO:1 and (iii) a nucleotide sequence encoding the same polypeptide that is encoded by nucleotide nos. 3355 to 7955 of SEQ ID NO:1.
 8. The expression cassette of claim 7, wherein the cassette is modified in an intron selected from the group consisting of the K18 intron I 3′ splice site, a CFTR intron 3′ cryptic splice site-1 and a CFTR intron 3′ cryptic splice site-2, thereby reducing alternative RNA splicing and increasing the steady state level of mRNA produced from the CFTR sequence.
 9. The expression cassette of claim 8, wherein the transgene comprises all or part of nucleotide numbers 3355 to 7955 of SEQ ID NO:1 and an intron selected from the group consisting of nucleotides 3301 to 3304, 3306 to 3308, 3310, 3315 and 3624 of SEQ ID NO:1.
 10. The expression cassette of claim 2, comprising all or part of the nucleotide sequence of SEQ ID NO:1.
 11. An expression cassette comprising at least 80% sequence identity with a sequence selected from the group consisting of nucleotide nos. 1 to 3354, 7949 to 12143 of SEQ ID NO:1 and SEQ ID NO:1.
 12. A vector comprising the expression cassette of claim 1.
 13. The expression cassette of claim 1, wherein the epithelial cell is a lung epithelial cell.
 14. The expression cassette of claim 13, wherein the lung epithelial cell is an airway epithelial cell or a submucosal cell.
 15. A mammalian epithelial cell comprising the expression cassette of claim 1.
 16. The mammalian epithelial cell of claim 15, wherein said mammalian epithelial cell is a human cell.
 17. A composition comprising an effective amount of the expression cassette of claim 7 and a pharmaceutically acceptable carrier.
 18. A composition comprising the expression cassette of claim 7 and a carrier.
 19. A method for expressing a gene in a target epithelial cell, comprising: administering to the epithelial cell an amount of the expression cassette of claims 1, 2, 3, 4, 5, 6, 9, 10 or 11 so that the expression cassette is inserted in the epithelial cell; and expressing a protein encoded by the transgene to produce the protein.
 20. A method for increasing chloride channels in an epithelial cell, comprising: administering to the epithelial cell an amount of the expression cassette of claims 1, 2, 3, 4, 5, 6, 9, 10 or 11 so that the expression cassette is inserted in the epithelial cell; and expressing a protein encoded by the transgene to produce the protein, wherein said protein is transported to the plasma membrane and generates chloride channels in the epithelial cell.
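
As an illustration of how the nucleotide ranges recited in the claims map onto SEQ ID NO:1, the following Python sketch tabulates each range and its length. It assumes the claim coordinates are 1-based and inclusive, as the feature ranges in the sequence listing (for example 1..24 for a 24-base primer) suggest. The dictionary keys and function names are hypothetical labels introduced for this sketch only and are not part of the claims.

```python
# Nucleotide ranges of SEQ ID NO:1 as recited in the claims.
# Keys are hypothetical labels for this sketch; coordinates are assumed
# to be 1-based and inclusive.
CLAIMED_RANGES = {
    "k18_5prime_region_and_promoter": (1, 2565),  # claim 3
    "k18_intron_1": (2566, 3315),                 # claim 2
    "translational_enhancer": (3316, 3354),       # claim 5
    "transgene": (3355, 7955),                    # claim 7
    "range_7949_to_12143": (7949, 12143),         # claims 6 and 11
}

# Total length given at the end of SEQ ID NO:1 in the sequence listing.
SEQ_ID_NO_1_LENGTH = 12143


def region_length(start: int, end: int) -> int:
    """Return the number of nucleotides in a 1-based inclusive range."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based inclusive range: {start}..{end}")
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a sequence."""
    region_length(start, end)  # reuse the range validation
    if end > len(sequence):
        raise ValueError(f"range end {end} exceeds sequence length {len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (start, end) in CLAIMED_RANGES.items():
        print(f"{name}: {start}..{end} ({region_length(start, end)} nt)")
```

For instance, `region_length(2566, 3315)` gives 750 for the range recited in claim 2, and `extract_region(seq, 3316, 3354)` would return the 39 nucleotides recited in claim 5 when `seq` holds the full 12143-nucleotide sequence of SEQ ID NO:1.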